How to open very large text files on Windows

Some graphical tools and two command line tips


I recently had to search for occurrences of a string in some very large text files, produced by a “file carving” run with Autopsy.

On Windows I usually use Notepad++, which provides a convenient ‘Find in Files’ feature, but this great tool has difficulty opening files larger than 2 GB.

However, there are other options on Windows:

  • gVim: you need to be familiar with vi/Vim commands to use it, and it loads the entire file into memory.
  • 010Editor: opens giant (think 5 GB) files in binary mode and allows you to edit and search the text.
  • Liquid XML Community Edition: opens and edits files of a terabyte or more instantly; supports UTF-8, Unicode, etc.
  • SlickEdit: a useful IDE that can open very large files.
  • Emacs: must be a 64-bit build; a 32-bit build has a low maximum buffer size.
  • glogg: read-only; allows searching with regular expressions.
  • PilotEdit: loads the entire file into memory first.
  • HxD: a hex editor that handles large files well; a portable version is available.
  • LogExpert: smoothly opens log files larger than 6 GB.
  • FileSeek: finds text strings or matches regular expressions.
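
If none of these tools is at hand, the original task (counting how often a string occurs in a huge file) can also be scripted. The following is only a minimal Python sketch, not something taken from any of the tools above: the function name count_occurrences is hypothetical, and it assumes a 64-bit Python so that files larger than 2 GB can be memory-mapped. Memory-mapping lets the operating system page the file in on demand instead of reading it all into memory first.

```python
import mmap
import os


def count_occurrences(path, needle):
    """Count non-overlapping occurrences of the bytes `needle` in the file at `path`.

    The file is memory-mapped read-only, so it is not copied into memory
    as a whole. (Hypothetical helper, for illustration only.)
    """
    if not needle:
        raise ValueError("needle must not be empty")
    if os.path.getsize(path) == 0:
        return 0  # an empty file cannot be memory-mapped
    count = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                # continue searching right after the current match
                pos = mm.find(needle, pos + len(needle))
    return count
```

A call such as count_occurrences("carved.txt", b"Login failed") would then return the number of matches; the file name here is just a placeholder. The search works on raw bytes, so the needle has to be encoded the same way as the file.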

Furthermore, if you feel comfortable using the command line, there are some console solutions built into Windows:

  • The more command might be good enough:
    Displays output one screen at a time.

    MORE [/E [/C] [/P] [/S] [/Tn] [+n]] < [drive:][path]filename
    command-name | MORE [/E [/C] [/P] [/S] [/Tn] [+n]]
    MORE /E [/C] [/P] [/S] [/Tn] [+n] [files]

        [drive:][path]filename  Specifies a file to display one
                                screen at a time.
        command-name            Specifies a command whose output
                                will be displayed.

        /E      Enable extended features
        /C      Clear screen before displaying page
        /P      Expand FormFeed characters
        /S      Squeeze multiple blank lines into a single line
        /Tn     Expand tabs to n spaces (default 8)
                Switches can be present in the MORE environment
                variable.
        +n      Start displaying the first file at line n
        files   List of files to be displayed. Files in the list
                are separated by blanks.

        If extended features are enabled, the following commands
        are accepted at the -- More -- prompt:

        P n     Display next n lines
        S n     Skip next n lines
        F       Display next file
        Q       Quit
        =       Show line number
        ?       Show help line
        <space> Display next page
        <ret>   Display next line
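
For comparison, here is a rough Python sketch of the same idea: looking at a slice of a huge file without opening the whole thing. It is similar in spirit to the +n switch, not a reimplementation of more. The function name read_lines and the UTF-8 default are assumptions made for this example.

```python
from itertools import islice


def read_lines(path, start=1, count=20, encoding="utf-8"):
    """Return up to `count` lines beginning at 1-based line number `start`.

    The file is read lazily, line by line, so only the requested lines
    are kept in memory. (Hypothetical helper, for illustration only.)
    """
    if start < 1 or count < 0:
        raise ValueError("start must be >= 1 and count must be >= 0")
    with open(path, "r", encoding=encoding, errors="replace") as f:
        # islice skips the first start-1 lines, then takes `count` lines
        window = islice(f, start - 1, start - 1 + count)
        return [line.rstrip("\n") for line in window]
```

Note that reaching line n still means reading everything before it, so jumping deep into a multi-gigabyte file takes a while, but memory use stays small.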

Windows also has a built-in program, findstr.exe, that searches within files:

    Searches for strings in files.

    FINDSTR [/B] [/E] [/L] [/R] [/S] [/I] [/X] [/V] [/N] [/M] [/O] [/P] [/F:file]
            [/C:string] [/G:file] [/D:dir list] [/A:color attributes] [/OFF[LINE]]
            strings [[drive:][path]filename[ ...]]

      /B         Matches pattern if at the beginning of a line.
      /E         Matches pattern if at the end of a line.
      /L         Uses search strings literally.
      /R         Uses search strings as regular expressions.
      /S         Searches for matching files in the current directory and all
                 subdirectories.
      /I         Specifies that the search is not to be case-sensitive.
      /X         Prints lines that match exactly.
      /V         Prints only lines that do not contain a match.
      /N         Prints the line number before each line that matches.
      /M         Prints only the filename if a file contains a match.
      /O         Prints character offset before each matching line.
      /P         Skip files with non-printable characters.
      /OFF[LINE] Do not skip files with offline attribute set.
      /A:attr    Specifies color attribute with two hex digits. See "color /?"
      /F:file    Reads file list from the specified file(/ stands for console).
      /C:string  Uses specified string as a literal search string.
      /G:file    Gets search strings from the specified file(/ stands for console).
      /D:dir     Search a semicolon delimited list of directories
      strings    Text to be searched for.
      [drive:][path]filename
                 Specifies a file or files to search.

    Use spaces to separate multiple search strings unless the argument is prefixed
    with /C. For example, 'FINDSTR "hello there" x.y' searches for "hello" or
    "there" in file x.y. 'FINDSTR /C:"hello there" x.y' searches for
    "hello there" in file x.y.

    Regular expression quick reference:
      .        Wildcard: any character
      *        Repeat: zero or more occurrences of previous character or class
      ^        Line position: beginning of line
      $        Line position: end of line
      [class]  Character class: any one character in set
      [^class] Inverse class: any one character not in set
      [x-y]    Range: any characters within the specified range
      \x       Escape: literal use of metacharacter x
      \<xyz    Word position: beginning of word
      xyz\>    Word position: end of word

    For full information on FINDSTR regular expressions refer to the online Command
    Reference.

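For example, to search every .txt file in the current directory and its subdirectories for the exact phrase "Login failed" (the /c: prefix matters: without it, findstr matches lines containing either "Login" or "failed"):

findstr /s /c:"Login failed" *.txt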
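
When findstr is not available, or the results need further processing, a recursive literal search can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function name search_tree, the "*.txt" default pattern and the UTF-8 default are choices made for this example, and the output format is not meant to mimic findstr.

```python
import fnmatch
import os


def search_tree(root, phrase, pattern="*.txt", encoding="utf-8"):
    """Yield (path, line_number, line) for every line containing `phrase`.

    Walks `root` and all its subdirectories and reads each matching file
    line by line, so large files are never loaded in full.
    (Hypothetical helper, for illustration only.)
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            path = os.path.join(dirpath, name)
            with open(path, "r", encoding=encoding, errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    if phrase in line:  # literal, case-sensitive match
                        yield path, lineno, line.rstrip("\n")


if __name__ == "__main__":
    for path, lineno, line in search_tree(".", "Login failed"):
        print(f"{path}:{lineno}:{line}")
```

The match is a plain substring test on the whole phrase, which corresponds to a literal, case-sensitive search rather than a regular expression.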

Do you know of other tools? Tips are welcome!
