chilehed


I need my code to be able to deal with .csv files of well over 65k rows, but haven't the slightest idea how to do it.

I've heard of a technique called "paging", which distributes the data over more than one sheet but processes it as if it's on one.

How is this done

Thanks.




Re: How do I deal with files of more than 65k rows?

magicalclick


Well, you need to read the file yourself. And then populate each cell into multiple sheets.

Or

You can split up the csv file into multiple csv files and open them using Excel csv reader.

I personally prefer the first approach because I need to put ' in front of zip code or zero leading IDs.






Re: How do I deal with files of more than 65k rows?

chilehed

I have a workable code to open the files and process my data using a limited data set, but the actual working files can have an arbitrarily high number of lines (as many as 2 or 3 million). Cutting the data at an arbitrary location causes issues, so together with the file size it can be cumbersome to do it manually.

I could really use something that automatically opens the data into a new sheet if it overruns the 65k line limit. Or reads the data in the file without actually opening it. Or something.






Re: How do I deal with files of more than 65k rows?

magicalclick

I am not sure how can that be bad. You can read the data and export it to a predefined paths. You are not changing the original file. But With 2 to 3 million rows of data, I don't think opening them in multiple sheet is a such good idea. Your code will be too slow since it is dealing with too much data on spread sheet. Have you consider reading the file using SQL If your actual working data is just a small portion of your source, you can run query to filter out the unwanted data before you populate your worksheet.




Re: How do I deal with files of more than 65k rows?

chilehed

It's time-based data, a series of step inputs are applied to a machine and the response is recorded at a rate of 10 milliseconds for the 5 or 10 seconds that the test runs. Some of the tests abort, and some don't, but all of the data is recorded because you have to pay attention to operating the machine which means you can't delete the aborted tests as you go along. The code goes line by line through the data, ignoring an aborted test when it gets to an abort flag and calculating a bunch of things from the signals.

Everyone here uses Excel a lot so there are reasons for doing it in VBA. The tool works really well until I run out of rows....I guess maybe I'll have to learn C or Matlab.





Re: How do I deal with files of more than 65k rows?

Peter Mo.

Hi

Do you really need to be working with all 2-3 million rows, or is it just 5 to 10 seconds of data you want to work with at once Once you've processed the data can you "discard" it

If you really need all the data then consider storing it in an Access or other database. You can still process it using Excel/VBA.

If you can get away with just 5 to 10 seconds then read that much in, process it and throw it way before you read in the next 5 to 10 seconds worth.

Regards

Peter Mo.





Re: How do I deal with files of more than 65k rows?

chilehed

Perter,

I could do that. One of the data columns has a flag to indicate the test completed - it could read until it gets to a set flag, process what it had, write the results to a destination sheet and then continue reading.

Unfortunately I haven't yet been able to find out how to do that.

Tom





Re: How do I deal with files of more than 65k rows?

Peter Mo.

Hi Tom

Try this ... first set up a reference to Microsoft Scripting Runtime (in VBA Tools/References). The following simple code should allow you to work through the data a set at a time. You will have to add your processing code. I'm not sure how long it will take with 2-3 million rows, so you might want to add some code that remembers where you got to so you can restart from a particular point. I suggest you develop/test with small amounts of data.

Code Snippet

Dim xlApp As Excel.Application

Dim scrFSO As Scripting.FileSystemObject
Dim scrTxt As Scripting.TextStream

Dim wbData As Workbook
Dim wsData As Worksheet

Dim wbResult As Workbook
Dim wsResult As Worksheet


Sub Main()

Dim strFN As String
Dim rwData As Long
Dim clData As Long
Dim strLine As String
Dim arrLine As Variant

' initialise key objects

Set xlApp = Excel.Application
Set scrFSO = New Scripting.FileSystemObject

' get the name of the cSV file

strFN = xlApp.GetOpenFilename(filefilter:="CSV File (*.csv), *.csv")

' did the user cancel

If strFN = "False" Then Exit Sub

' open the csv file

Set scrTxt = scrFSO.OpenTextFile(Filename:=strFN, IOMode:=ForReading)

' create a workbook for the data

Set wbData = xlApp.Workbooks.Add(xlWBATWorksheet)
Set wsData = wbData.Sheets(1)

' create a workbook for the results (or whatever you want)

Set wbResult = xlApp.Workbooks.Add(xlWBATWorksheet)
Set wsResult = wbResult.Sheets(1)

' now work through the csv

Do While Not scrTxt.AtEndOfStream

' make sure the data worksheet is clear

wsData.Cells.Clear
rwData = 0

' loop through the data

Do

' read the data

strLine = scrTxt.ReadLine
arrLine = Split(strLine, ",")

' store it in the data worksheet

rwData = rwData + 1
For clData = 0 To UBound(arrLine)
wsData.Cells(rwData, clData + 1) = arrLine(clData)
Next clData

' test for end of data - your code

' If <end of data set> Then Exit Do

Loop

' now process the data - your code
' writing the results to wsResults (if you want)

Loop

' OK finished - tidy up

scrTxt.Close

wbData.Close savechanges:=False

MsgBox "Finished"

' leaves the results workbook open for the user to save

End Sub

Good luck

Regards

Peter Mo.





Re: How do I deal with files of more than 65k rows?

chilehed

Peter,

I think I see how that works, and it looks to be what I need. In the last week or so a bunch of other work has hit my desk so it'll take a while for me to really figure this out and integrate it into what I have. I'll post back when I have something to report.

Again, thanks a whole lot.

Tom