How to download and save a file from the Internet using Java?
There is an online file (such as http://www.website.com/information.asp) that I need to grab and save to a directory. I know there are several ways to grab and read online files (URLs) line by line, but is there a way to just download and save the file using Java?
Give Java NIO a try:
URL website = new URL("http://www.website.com/information.asp");
ReadableByteChannel rbc = Channels.newChannel(website.openStream());
FileOutputStream fos = new FileOutputStream("information.html");
fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
Using transferFrom() is potentially much more efficient than a simple loop that reads from the source channel and writes to this channel; many operating systems can transfer bytes directly from the source channel into the filesystem cache without actually copying them.
Check here for more information.
Note: the third parameter in transferFrom is the maximum number of bytes to transfer.
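A further note (an addition, not from the original answer): a single transferFrom call with Long.MAX_VALUE normally copies the whole stream, but if you want to be defensive about a source that delivers fewer bytes per call, you can loop until the channel is exhausted. A sketch, reusing rbc and fos from the snippet above:

long position = 0;
long transferred;
// Keep transferring 16 MB chunks until the source channel reports no more bytes.
while ((transferred = fos.getChannel().transferFrom(rbc, position, 1 << 24)) > 0) {
    position += transferred;
}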
With Apache Commons IO, it takes only one line of code:
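The one-line snippet itself was stripped; presumably it was the Commons IO helper, along these lines (reusing the example URL and file name from above):

// org.apache.commons.io.FileUtils downloads the URL to the file,
// creating parent directories if necessary.
FileUtils.copyURLToFile(
        new URL("http://www.website.com/information.asp"),
        new File("information.html"));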
Simpler usage with java.nio:
URL website = new URL("http://www.website.com/information.asp");
try (InputStream in = website.openStream()) {
    // 'target' is a java.nio.file.Path pointing at the destination file
    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
}
public void saveUrl(final String filename, final String urlString)
        throws MalformedURLException, IOException {
    BufferedInputStream in = null;
    FileOutputStream fout = null;
    try {
        in = new BufferedInputStream(new URL(urlString).openStream());
        fout = new FileOutputStream(filename);

        final byte data[] = new byte[1024];
        int count;
        while ((count = in.read(data, 0, 1024)) != -1) {
            fout.write(data, 0, count);
        }
    } finally {
        if (in != null) {
            in.close();
        }
        if (fout != null) {
            fout.close();
        }
    }
}
You'll need to handle exceptions, probably outside of this method.
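For example, a caller might wrap it like this (a sketch with hypothetical file and URL values):

try {
    saveUrl("information.html", "http://www.website.com/information.asp");
} catch (IOException e) {
    // MalformedURLException is a subclass of IOException, so this covers both
    e.printStackTrace();
}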
Downloading a file requires you to read it; either way, you will have to go through the file somehow. Instead of going through it line by line, you can just read it byte by byte from the stream:
BufferedInputStream in = new BufferedInputStream(
        new URL("http://www.website.com/information.asp").openStream());
// 'out' is an OutputStream for the destination, e.g. a FileOutputStream
byte data[] = new byte[1024];
int count;
while ((count = in.read(data, 0, 1024)) != -1) {
    out.write(data, 0, count);
}
This is an old question, but here is a concise, readable, JDK-only solution with properly closed resources:
public static void download(String url, String fileName) throws Exception {
    try (InputStream in = URI.create(url).toURL().openStream()) {
        Files.copy(in, Paths.get(fileName));
    }
}
Two lines of code and no dependencies.
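One caveat worth adding (not part of the original answer): Files.copy(in, Paths.get(fileName)) fails with a FileAlreadyExistsException if the target file already exists. To overwrite instead, pass the replace option:

Files.copy(in, Paths.get(fileName), StandardCopyOption.REPLACE_EXISTING);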
When using Java 7+, the following method downloads a file from the Internet and saves it to a target directory:
private static Path download(String sourceURL, String targetDirectory) throws IOException {
    URL url = new URL(sourceURL);
    String fileName = sourceURL.substring(sourceURL.lastIndexOf('/') + 1, sourceURL.length());
    Path targetPath = new File(targetDirectory + File.separator + fileName).toPath();
    Files.copy(url.openStream(), targetPath, StandardCopyOption.REPLACE_EXISTING);

    return targetPath;
}
Documentation here.
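A possible call, with a hypothetical URL and target directory:

Path saved = download("http://www.website.com/information.asp", "/tmp/downloads");
System.out.println("Saved to " + saved);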
This answer is almost exactly like the accepted answer, but with two enhancements: it is wrapped in a method and it closes the FileOutputStream object:
public static void downloadFileFromURL(String urlString, File destination) {
    try {
        URL website = new URL(urlString);
        ReadableByteChannel rbc;
        rbc = Channels.newChannel(website.openStream());
        FileOutputStream fos = new FileOutputStream(destination);
        fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
        fos.close();
        rbc.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
import java.io.*;
import java.net.*;

public class filedown {
    public static void download(String address, String localFileName) {
        OutputStream out = null;
        URLConnection conn = null;
        InputStream in = null;

        try {
            URL url = new URL(address);
            out = new BufferedOutputStream(new FileOutputStream(localFileName));
            conn = url.openConnection();
            in = conn.getInputStream();
            byte[] buffer = new byte[1024];

            int numRead;
            long numWritten = 0;

            while ((numRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, numRead);
                numWritten += numRead;
            }

            System.out.println(localFileName + "\t" + numWritten);
        } catch (Exception exception) {
            exception.printStackTrace();
        } finally {
            try {
                if (in != null) {
                    in.close();
                }
                if (out != null) {
                    out.close();
                }
            } catch (IOException ioe) {
                // ignore close failures
            }
        }
    }

    public static void download(String address) {
        int lastSlashIndex = address.lastIndexOf('/');
        if (lastSlashIndex >= 0 && lastSlashIndex < address.length() - 1) {
            // use everything after the last '/' as the local file name
            download(address, address.substring(lastSlashIndex + 1));
        } else {
            System.err.println("Could not figure out local file name for " + address);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < args.length; i++) {
            download(args[i]);
        }
    }
}
Personally, I've found Apache's HttpClient to be more than capable of everything I've needed with regard to this. Here is a great tutorial on using HttpClient.
This is another Java 7 variant based on Brian Risk's answer, using a try-with-resources statement:
public static void downloadFileFromURL(String urlString, File destination) throws Throwable {
    URL website = new URL(urlString);
    try (ReadableByteChannel rbc = Channels.newChannel(website.openStream());
         FileOutputStream fos = new FileOutputStream(destination)) {
        fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
    }
}
There are many elegant and efficient answers here, but their conciseness makes us lose some useful information. In particular, one often does not want to treat a connection error as an exception, and one may want to handle certain network-related errors differently - for example, to decide whether the download should be retried.
Here is a method that does not throw exceptions on network errors (only on truly exceptional problems, such as a malformed URL or problems writing to the file):
/**
 * Downloads from a (http/https) URL and saves to a file.
 * Does not consider a connection error an Exception. Instead it returns:
 *
 *   0=ok
 *   1=connection interrupted, timeout (but something was read)
 *   2=not found (FileNotFoundException) (404)
 *   3=server error (500...)
 *   4=could not connect: connection timeout (no internet?) java.net.SocketTimeoutException
 *   5=could not connect: (server down?) java.net.ConnectException
 *   6=could not resolve host (bad host, or no internet - no dns)
 *
 * @param file File to write. Parent directory will be created if necessary
 * @param url http/https url to connect
 * @param secsConnectTimeout Seconds to wait for connection establishment
 * @param secsReadTimeout Read timeout in seconds - transmission will abort if it freezes more than this
 * @return See above
 * @throws IOException Only if URL is malformed or if could not create the file
 */
public static int saveUrl(final Path file, final URL url,
        int secsConnectTimeout, int secsReadTimeout) throws IOException {
    Files.createDirectories(file.getParent()); // make sure parent dir exists; this can throw an exception
    URLConnection conn = url.openConnection(); // can throw an exception if bad url
    if (secsConnectTimeout > 0) {
        conn.setConnectTimeout(secsConnectTimeout * 1000);
    }
    if (secsReadTimeout > 0) {
        conn.setReadTimeout(secsReadTimeout * 1000);
    }
    int ret = 0;
    boolean somethingRead = false;
    try (InputStream is = conn.getInputStream()) {
        try (BufferedInputStream in = new BufferedInputStream(is);
             OutputStream fout = Files.newOutputStream(file)) {
            final byte data[] = new byte[8192];
            int count;
            while ((count = in.read(data)) > 0) {
                somethingRead = true;
                fout.write(data, 0, count);
            }
        }
    } catch (java.io.IOException e) {
        int httpcode = 999;
        try {
            httpcode = ((HttpURLConnection) conn).getResponseCode();
        } catch (Exception ee) {
            // not an HTTP connection, or the response code could not be read
        }
        if (somethingRead && e instanceof java.net.SocketTimeoutException) {
            ret = 1;
        } else if (e instanceof FileNotFoundException && httpcode >= 400 && httpcode < 500) {
            ret = 2;
        } else if (httpcode >= 400 && httpcode < 600) {
            ret = 3;
        } else if (e instanceof java.net.SocketTimeoutException) {
            ret = 4;
        } else if (e instanceof java.net.ConnectException) {
            ret = 5;
        } else if (e instanceof java.net.UnknownHostException) {
            ret = 6;
        } else {
            throw e;
        }
    }
    return ret;
}
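For example, a caller could use the return code to decide whether a retry makes sense. A rough sketch with made-up timeout values (inside a method that declares throws IOException):

// Retry connection-level failures (codes 4-6) a couple of times before giving up.
int code;
int attempts = 0;
do {
    code = saveUrl(Paths.get("information.html"),
            new URL("http://www.website.com/information.asp"), 10, 30);
    attempts++;
} while (code >= 4 && attempts < 3);
if (code != 0) {
    System.err.println("Download failed with code " + code);
}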
You can also download the file with Apache's HttpComponents HttpClient (together with Commons IO's FileUtils):
public static boolean saveFile(URL fileURL, String fileSavePath) {
    boolean isSucceed = true;

    CloseableHttpClient httpClient = HttpClients.createDefault();

    HttpGet httpGet = new HttpGet(fileURL.toString());
    httpGet.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0");
    httpGet.addHeader("Referer", "https://www.google.com");

    try {
        CloseableHttpResponse httpResponse = httpClient.execute(httpGet);
        HttpEntity fileEntity = httpResponse.getEntity();

        if (fileEntity != null) {
            FileUtils.copyInputStreamToFile(fileEntity.getContent(), new File(fileSavePath));
        }
    } catch (IOException e) {
        isSucceed = false;
    }

    httpGet.releaseConnection();

    return isSucceed;
}
Compare this with the single line of code:
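The one-liner being compared against was stripped; presumably it was the Commons IO call with connect/read timeouts, roughly like this (timeout values are made up):

// No custom headers are possible this way; timeouts are in milliseconds.
FileUtils.copyURLToFile(fileURL, new File(fileSavePath), 10000, 30000);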
The saveFile method above gives you more control over the process and lets you specify not only timeout values but also the User-Agent and Referer headers, which are critical for many websites.
There is a method, U.fetch(url), in the underscore-java library.
pom.xml dependency:
<dependency>
    <groupId>com.github.javadev</groupId>
    <artifactId>underscore</artifactId>
    <version>1.45</version>
</dependency>
Code example:
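The original snippet was not preserved; below is a minimal sketch, assuming U.fetch(url) returns a response object whose blob() method yields the downloaded bytes (please double-check against the documentation of the underscore version you use):

// The package name has varied between releases; adjust the import if needed.
import com.github.underscore.lodash.U;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Download {
    public static void main(String[] args) throws IOException {
        // Fetch the URL and write the raw response bytes to a local file.
        Files.write(Paths.get("index.html"), U.fetch("https://stackoverflow.com").blob());
    }
}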
Summarizing (and somewhat polishing and updating) the previous answers: the three methods below are practically equivalent. (I added explicit timeouts because I think they are a must; nobody wants a download to freeze forever when the connection is lost.)
public static void saveUrl1(final Path file, final URL url,
        int secsConnectTimeout, int secsReadTimeout)
        throws MalformedURLException, IOException {
    // Files.createDirectories(file.getParent()); // optional, make sure parent dir exists
    try (BufferedInputStream in = new BufferedInputStream(
             streamFromUrl(url, secsConnectTimeout, secsReadTimeout));
         OutputStream fout = Files.newOutputStream(file)) {
        final byte data[] = new byte[8192];
        int count;
        while ((count = in.read(data)) > 0) {
            fout.write(data, 0, count);
        }
    }
}

public static void saveUrl2(final Path file, final URL url,
        int secsConnectTimeout, int secsReadTimeout)
        throws MalformedURLException, IOException {
    // Files.createDirectories(file.getParent()); // optional, make sure parent dir exists
    try (ReadableByteChannel rbc = Channels.newChannel(
             streamFromUrl(url, secsConnectTimeout, secsReadTimeout));
         FileChannel channel = FileChannel.open(file,
                 StandardOpenOption.CREATE,
                 StandardOpenOption.TRUNCATE_EXISTING,
                 StandardOpenOption.WRITE)) {
        channel.transferFrom(rbc, 0, Long.MAX_VALUE);
    }
}

public static void saveUrl3(final Path file, final URL url,
        int secsConnectTimeout, int secsReadTimeout)
        throws MalformedURLException, IOException {
    // Files.createDirectories(file.getParent()); // optional, make sure parent dir exists
    try (InputStream in = streamFromUrl(url, secsConnectTimeout, secsReadTimeout)) {
        Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
    }
}

public static InputStream streamFromUrl(URL url, int secsConnectTimeout, int secsReadTimeout)
        throws IOException {
    URLConnection conn = url.openConnection();
    if (secsConnectTimeout > 0) {
        conn.setConnectTimeout(secsConnectTimeout * 1000);
    }
    if (secsReadTimeout > 0) {
        conn.setReadTimeout(secsReadTimeout * 1000);
    }
    return conn.getInputStream();
}
I find no significant differences, and all of them seem right to me. They are safe and efficient. (Differences in speed seem hardly relevant - I write 180 MB from a local server to an SSD disk in times that fluctuate around 1.2 to 1.5 seconds.) They don't require external libraries. All of them work with arbitrary sizes and (in my experience) HTTP redirections.
Additionally, all of them throw FileNotFoundException if the resource is not found (typically a 404 error).
(Marked as community wiki - feel free to add information or corrections.)
There is a problem with the simple usage of org.apache.commons.io.FileUtils.copyURLToFile(URL, File) if you need to download and save very large files, or in general if you need automatic retries in case the connection is dropped.
In such cases, I would suggest Apache HttpClient along with org.apache.commons.io.FileUtils. For example:
// 'client' is an org.apache.commons.httpclient.HttpClient instance (HttpClient 3.x API)
GetMethod method = new GetMethod(resource_url);
try {
    int statusCode = client.executeMethod(method);
    if (statusCode != HttpStatus.SC_OK) {
        logger.error("Get method failed: " + method.getStatusLine());
    }
    org.apache.commons.io.FileUtils.copyInputStreamToFile(
            method.getResponseBodyAsStream(), new File(resource_file));
} catch (HttpException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
} finally {
    method.releaseConnection();
}
Here is sample code that downloads a movie from the Internet with Java:
URL url = new URL("http://103.66.178.220/ftp/HDD2/Hindi%20Movies/2018/Hichki%202018.mkv");
BufferedInputStream bufferedInputStream = new BufferedInputStream(url.openStream());
FileOutputStream stream = new FileOutputStream("/home/sachin/Desktop/test.mkv");

int count = 0;
byte[] b1 = new byte[100];

while ((count = bufferedInputStream.read(b1)) != -1) {
    System.out.println("b1:" + b1 + ">>" + count
            + ">> KB downloaded:" + new File("/home/sachin/Desktop/test.mkv").length() / 1024);
    stream.write(b1, 0, count);
}
// Note: in real code, close bufferedInputStream and stream when done (e.g. with try-with-resources).
public class DownloadManager {

    static String urls = "[WEBSITE NAME]";

    public static void main(String[] args) throws IOException {
        URL url = verify(urls);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        InputStream in = null;
        String filename = url.getFile();
        filename = filename.substring(filename.lastIndexOf('/') + 1);
        FileOutputStream out = new FileOutputStream("C:\\Java2_programiranje/Network/DownloadTest1/Project/Output" + File.separator + filename);
        in = connection.getInputStream();
        int read = -1;
        byte[] buffer = new byte[4096];
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            System.out.println("[SYSTEM/INFO]: Downloading file...");
        }
        in.close();
        out.close();
        System.out.println("[SYSTEM/INFO]: File Downloaded!");
    }

    private static URL verify(String url) {
        if (!url.toLowerCase().startsWith("http://")) {
            return null;
        }
        URL verifyUrl = null;

        try {
            verifyUrl = new URL(url);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return verifyUrl;
    }
}
You can do it in one line using the netloader library for Java:
new NetFile(new File("my/zips/1.zip"), "https://example.com/example.zip", -1).load(); // returns true if it succeeded, false otherwise
If you are behind a proxy, you can set the proxy settings in your Java program as below:
Properties systemSettings = System.getProperties();
systemSettings.put("proxySet", "true");
systemSettings.put("https.proxyHost", "https proxy of your org");
systemSettings.put("https.proxyPort", "8080");
If you are not behind a proxy, don't include those lines in your code. Here is the full working code to download a file when you are behind a proxy:
public static void main(String[] args) throws IOException {
    String url = "https://raw.githubusercontent.com/bpjoshi/fxservice/master/src/test/java/com/bpjoshi/fxservice/api/TradeControllerTest.java";
    OutputStream outStream = null;
    URLConnection connection = null;
    InputStream is = null;
    File targetFile = null;
    URL server = null;

    // Setting up proxies
    Properties systemSettings = System.getProperties();
    systemSettings.put("proxySet", "true");
    systemSettings.put("https.proxyHost", "https proxy of my organisation");
    systemSettings.put("https.proxyPort", "8080");
    // The same way we could also set a proxy for http
    System.setProperty("java.net.useSystemProxies", "true");

    // Code to fetch the file
    try {
        server = new URL(url);
        connection = server.openConnection();
        is = connection.getInputStream();
        // Caveat: available() only reports the bytes readable without blocking,
        // which is not necessarily the full file size; a read loop is more robust.
        byte[] buffer = new byte[is.available()];
        is.read(buffer);

        targetFile = new File("src/main/resources/targetFile.java");
        outStream = new FileOutputStream(targetFile);
        outStream.write(buffer);
    } catch (MalformedURLException e) {
        System.out.println("THE URL IS NOT CORRECT");
        e.printStackTrace();
    } catch (IOException e) {
        System.out.println("IO exception");
        e.printStackTrace();
    } finally {
        if (outStream != null) {
            outStream.close();
        }
    }
}