3 Answers

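Contributor with 2041 experience points, over 4 upvotes
This is a fairly easy task for PowerShell, complicated only by the fact that the standard Get-Content cmdlet does not handle very large files well. What I would suggest is to use the .NET StreamReader class to read the file line by line in your PowerShell script, and use the Add-Content cmdlet to write each line to a file whose name carries an ever-increasing index. Something like this: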
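$upperBound = 50MB   # PowerShell expands the MB suffix to 52428800 bytes
$ext = "log"
$rootName = "log_"
$reader = New-Object System.IO.StreamReader("C:\Exceptions.log")
$count = 1
$fileName = "{0}{1}.{2}" -f ($rootName, $count, $ext)
try
{
    # ReadLine() returns $null at end of file; $null goes on the left of the comparison
    while ($null -ne ($line = $reader.ReadLine()))
    {
        Add-Content -Path $fileName -Value $line
        # Once the current chunk reaches the size limit, switch to the next file name
        if ((Get-Item -Path $fileName).Length -ge $upperBound)
        {
            ++$count
            $fileName = "{0}{1}.{2}" -f ($rootName, $count, $ext)
        }
    }
}
finally
{
    # Release the input file even if a write fails
    $reader.Close()
}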
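
The script above calls Add-Content and checks the file size once per input line, which can be slow on large logs. A minimal sketch of one possible variation keeps a StreamWriter open per chunk and tracks the bytes written in memory. The function name `Split-FileBySize`, its parameters, and the choice of UTF-8 without a BOM are assumptions made for illustration and are not part of the answer. Because .NET resolves relative paths against the process working directory rather than the current PowerShell location, the sketch expects full paths.

```powershell
# Hypothetical helper: name, parameters and encoding are illustrative assumptions.
function Split-FileBySize {
    param(
        [Parameter(Mandatory)] [string] $Path,       # input file (full path)
        [Parameter(Mandatory)] [string] $RootName,   # output prefix (full path), e.g. C:\logs\log_
        [string] $Extension = "log",
        [long]   $UpperBound = 50MB
    )
    $encoding = New-Object System.Text.UTF8Encoding($false)   # assumption: UTF-8, no BOM
    $reader = New-Object System.IO.StreamReader($Path, $encoding)
    $writer = $null
    $count = 0
    [long] $written = 0
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            # Open the next chunk lazily so no empty trailing file is created
            if ($null -eq $writer) {
                $count++
                $fileName = "{0}{1}.{2}" -f $RootName, $count, $Extension
                $writer = New-Object System.IO.StreamWriter($fileName, $false, $encoding)
                $written = 0
            }
            $writer.WriteLine($line)
            # Track the chunk size in memory instead of querying the file system per line
            $written += $encoding.GetByteCount($line) + $encoding.GetByteCount($writer.NewLine)
            if ($written -ge $UpperBound) {
                $writer.Dispose()
                $writer = $null
            }
        }
    }
    finally {
        if ($writer) { $writer.Dispose() }
        $reader.Dispose()
    }
    return $count   # number of chunk files written
}
```

A call might look like `Split-FileBySize -Path "C:\Exceptions.log" -RootName "C:\log_"`, where the output prefix is again only an example. As in the original script, a chunk is closed after the line that pushes it to or past the limit, so files can end slightly above `$UpperBound`.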

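Contributor with 1812 experience points, over 5 upvotes
Same idea as the other answers here, but using StreamReader / StreamWriter to split on line boundaries (line by line, rather than trying to read the whole file into memory at once). This is the fastest way I know of to split a large file.
Note: I do very little error checking, so I can't guarantee it will work smoothly in your case. It did in mine (a 1.7 GB TXT file of 4 million lines was split into files of 100,000 lines each in 95 seconds).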
# Split test
$sw = New-Object System.Diagnostics.Stopwatch
$sw.Start()
$filename = "C:\Users\Vincent\Desktop\test.txt"
$rootName = "C:\Users\Vincent\Desktop\result"
$ext = "txt"              # no leading dot: the format string below already adds one
$linesperFile = 100000    # 100k
$filecount = 1
$reader = $null
$writer = $null
try {
    $reader = [IO.File]::OpenText($filename)
    try {
        "Creating file number $filecount"
        $writer = [IO.File]::CreateText("{0}{1}.{2}" -f ($rootName, $filecount.ToString("000"), $ext))
        $filecount++
        $linecount = 0
        while ($reader.EndOfStream -ne $true) {
            "Reading $linesperFile"
            # Copy lines until the current chunk is full or the input is exhausted
            while (($linecount -lt $linesperFile) -and ($reader.EndOfStream -ne $true)) {
                $writer.WriteLine($reader.ReadLine())
                $linecount++
            }
            # More input remains: close this chunk and start the next one
            if ($reader.EndOfStream -ne $true) {
                "Closing file"
                $writer.Dispose()
                "Creating file number $filecount"
                $writer = [IO.File]::CreateText("{0}{1}.{2}" -f ($rootName, $filecount.ToString("000"), $ext))
                $filecount++
                $linecount = 0
            }
        }
    } finally {
        # Guard against a failed CreateText leaving $writer unset
        if ($writer) { $writer.Dispose() }
    }
} finally {
    if ($reader) { $reader.Dispose() }
}
$sw.Stop()
Write-Host "Split complete in" $sw.Elapsed.TotalSeconds "seconds"
Output from splitting the 1.7 GB file:
...
Creating file number 45
Reading 100000
Closing file
Creating file number 46
Reading 100000
Closing file
Creating file number 47
Reading 100000
Closing file
Creating file number 48
Reading 100000
Split complete in 95.6308289 seconds
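
Since the script does very little error checking, one way to gain some confidence in a run is to compare the line count of the source file with the total across the chunk files. The following is only a sketch: the helper names `Get-LineCount` and `Test-SplitLineCount` and the wildcard parameter are hypothetical and do not come from the answer. Both helpers stream the files, so nothing large is held in memory.

```powershell
# Hypothetical helper: counts lines by streaming through the file.
function Get-LineCount {
    param([Parameter(Mandatory)] [string] $Path)
    $reader = [IO.File]::OpenText($Path)
    try {
        $n = 0
        while ($null -ne $reader.ReadLine()) { $n++ }
        return $n
    }
    finally {
        $reader.Dispose()
    }
}

# Hypothetical helper: returns $true when the chunk files together hold
# exactly as many lines as the source file.
function Test-SplitLineCount {
    param(
        [Parameter(Mandatory)] [string] $SourcePath,     # original file (full path)
        [Parameter(Mandatory)] [string] $ChunkPattern    # wildcard matching only the chunk files
    )
    $total = 0
    foreach ($chunk in Get-ChildItem -Path $ChunkPattern) {
        $total += Get-LineCount -Path $chunk.FullName
    }
    return ($total -eq (Get-LineCount -Path $SourcePath))
}
```

With the paths from the script above, a check could be `Test-SplitLineCount -SourcePath "C:\Users\Vincent\Desktop\test.txt" -ChunkPattern "C:\Users\Vincent\Desktop\result*.txt"`. A matching count does not prove the contents are identical, it only catches dropped or duplicated lines.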