Memory-efficient ways of reading and downloading large files in Ruby
To read a large file from the disk
File.foreach reads the file line by line, so only the current line is held in memory; that is why it is safe to use for large files.
It accepts a block, which is executed for each line of the file.
Example:
require 'json'
File.foreach('example.jsonl') { |line| JSON.parse(line) }
However, in the Ruby documentation you will not find this method defined on the File class. It is defined on IO, which is the superclass of File.
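As a minimal sketch of the line-by-line approach applied to a JSON Lines file: the helper name each_json_record is hypothetical (it is not part of Ruby), and the sketch assumes each non-blank line holds one complete JSON document.

```ruby
require 'json'

# Hypothetical helper (not part of Ruby): yields one parsed record per line.
# Assumes every non-blank line is a complete JSON document.
def each_json_record(path)
  # Without a block, return an Enumerator so callers can chain map, first, etc.
  return enum_for(:each_json_record, path) unless block_given?

  File.foreach(path, chomp: true) do |line|
    next if line.strip.empty? # skip blank lines

    yield JSON.parse(line)
  end
end
```

Because records are yielded one at a time, a call such as each_json_record('example.jsonl').first(10) stops reading after the tenth record instead of scanning the whole file.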
To download large files from the Internet
We can use IO.copy_stream to download large files from the Internet. It copies data from the source to the destination in chunks instead of loading the whole file into memory before writing.
Example:
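require 'open-uri'
# URI.open comes from open-uri (Kernel#open stopped accepting URLs in Ruby 3.0)
IO.copy_stream(URI.open('http://example.com/file.jsonl'), 'file.jsonl')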
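One caveat worth knowing: as far as I am aware, open-uri fetches the whole response into a temporary file (or a StringIO for small bodies) before handing it back, so the data is written twice. If that matters, the body can be streamed straight to disk with Net::HTTP from the standard library. The following is only a sketch: the helper name download_to_file is hypothetical, and it assumes a plain GET with no redirects, authentication, or retries.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of Ruby): writes the response body to
# `destination` chunk by chunk, so the whole file never sits in memory.
# Assumes a plain GET; redirects are not followed.
def download_to_file(url, destination)
  uri = URI.parse(url)

  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    # The block form of #request yields the response before the body is read.
    http.request(Net::HTTP::Get.new(uri)) do |response|
      response.value # raises unless the status is 2xx

      File.open(destination, 'wb') do |file|
        response.read_body { |chunk| file.write(chunk) }
      end
    end
  end

  destination
end
```

It would be called as download_to_file('http://example.com/file.jsonl', 'file.jsonl').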
To sum it up…
Memory-friendly methods: File.foreach, IO.copy_stream
Other methods to read a file, which load the whole file into memory: File.read, File.readlines