Green Lantern

I'm trying to count instances of words in a text box and display the count number and word in a listbox. Here's what I have now:

Private Sub SortUp()

Dim source As String = txtData.Text

Dim WordList As String = ""

Dim result As New System.Text.StringBuilder

Dim parser As New Regex("\b[a-z]+\b")

source = source.ToLower

txtWordList.Text = ""

Dim sourceMatches As MatchCollection = parser.Matches(source)

Dim counter As Integer

'Dim finder As Integer = sourceMatches(counter).Value.ToString()

For counter = 0 To sourceMatches.Count - 1

result.Append(sourceMatches(counter).Value.ToString()).Append(" ")

Next counter

WordList = WordList & result.ToString.Trim

Dim S() As String = Split(WordList, " ")

Array.Sort(S)

Dim Word As String

Dim Idx As Single = 0

Dim Qty As Single = 1

Dim tempWord As String = ""

txtWordList.ForeColor = Color.Black

'Dim WordCountArray As String

lstData.Items.Clear()

Dim Qty_tmp As Single = 1

For Each Word In S

If tempWord = Word Then

Qty += 1

lstData.Items.Add(Idx & " (" & Qty & ") " & Word)

'Debug.Print(Idx & " (" & Qty & ") " & Word)

Else

Idx += 1

'check for last instance

Qty = 1

'Debug.Print(Idx & " (" & Qty & ") " & Word)

lstData.Items.Add(Idx & " (" & Qty & ") " & Word)

tempWord = Word

End If

Next

End Sub

I tried a test file like this: "now hear this now this deep now" and got these results:

1 (1) deep

2 (1) hear

3 (1) now

3 (2) now

3 (3) now

4 (1) this

4 (2) this

The result I want is:

1 (1) deep

2 (1) hear

3 (3) now

4 (2) this

Can anyone suggest how to do this?
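One way to get that output, as a minimal sketch rather than a definitive fix: walk the sorted array and write a line only when the word changes, then once more after the loop for the last run. CountSortedWords is a hypothetical helper name, and it returns the lines as a list instead of adding them to lstData so that it stands alone.

```vb
Imports System
Imports System.Collections.Generic

Module SortedRunCount
    ' Hypothetical helper: sorts a copy of the words and returns one
    ' "index (count) word" line per distinct word.
    Function CountSortedWords(ByVal words() As String) As List(Of String)
        Dim lines As New List(Of String)
        If words.Length = 0 Then Return lines

        Dim sorted() As String = CType(words.Clone(), String())
        Array.Sort(sorted, StringComparer.Ordinal)

        Dim idx As Integer = 1   ' running number of the distinct word
        Dim qty As Integer = 1   ' occurrences of the current word
        For i As Integer = 1 To sorted.Length - 1
            If sorted(i) = sorted(i - 1) Then
                qty += 1
            Else
                ' The word changed, so the previous run is complete.
                lines.Add(idx & " (" & qty & ") " & sorted(i - 1))
                idx += 1
                qty = 1
            End If
        Next
        ' Write out the last run, which the loop never closes.
        lines.Add(idx & " (" & qty & ") " & sorted(sorted.Length - 1))
        Return lines
    End Function
End Module
```

Each returned line could then be passed to lstData.Items.Add.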




Re: Visual Basic Express Edition counting words in text box.

LouieG

Hi Green Lantern

I did my own quickly, just to illustrate.

Private Sub SortUp()

Dim testdata, Result(), myWord As String

Dim ResultQty(), i, x As Short

x = -1

ReDim Result(x)

ReDim ResultQty(x)

testdata = "now hear this now this deep now"

testdata = testdata.ToLower.Trim

Dim S() As String = Split(testdata, " ")

Array.Sort(S)

For Each myWord In S

i = Array.IndexOf(Result, myWord) '-- see if you have the word yet

If i > -1 Then

ResultQty(i) = ResultQty(i) + 1 '-- if you do then add to counter

Else '-- if not then add the words found

x = x + 1

ReDim Preserve ResultQty(x)

ReDim Preserve Result(x)

Result(x) = myWord

ResultQty(x) = 1

End If

Next

For i = 0 To x

MsgBox(ResultQty(i) & Result(i))

Next

End Sub

hope it helps

Louie






Re: Visual Basic Express Edition counting words in text box.

Green Lantern

LouieG,

I'm marking this as answered. This is what I need. I can massage it for my particular program now. Thanks so much.

Green Lantern






Re: Visual Basic Express Edition counting words in text box.

ReneeC

Private Sub CNT()

Dim Words As New Dictionary(Of String, Integer)

Dim KVP As ICollection(Of KeyValuePair(Of String, Integer)) = Words

Dim A() As String = TextBox1.Text.Split(" ")

For Each wrd As String In A

If Not Words.ContainsKey(wrd) Then

Words.Add(wrd, 1)

Else

Words.Item(wrd) = Words.Item(wrd) + 1

End If

Next

For Each element As KeyValuePair(Of String, Integer) In Words

TextBox2.Text &= "(" & element.Value & ") " & element.Key & vbCrLf

Next

End Sub






Re: Visual Basic Express Edition counting words in text box.

LouieG

Green Lantern,

Look at what ReneeC does here. That is serious stuff (I think), which is why she is an Answerer and I'm a newbie. Those constructs and that part of the language are like a foreign language to me. We should all learn from samples like this. Thanks, ReneeC.

Louie






Re: Visual Basic Express Edition counting words in text box.

ReneeC

I would not pay too much attention to the stars. I came from a corporate culture that really didn't like Unix and C.

When I saw Green Lantern use regex (which certainly has its place but came from Unix-land), I thought I'd demonstrate something a little different. It's called .Net. ;)






Re: Visual Basic Express Edition counting words in text box.

Green Lantern

Thanks, ReneeC and LouieG,

These posts are great info. I thought regular expressions were good, but I've definitely got to study what you put here. BTW LouieG, I've just finished fitting your solution into my code and it works fine. Is there a performance trade-off here I should be aware of? We'll be looking at text documents of up to 100,000 words.

Maybe I need a new thread here, but I'm wondering if the results can be sorted on quantity, the ResultQty() array. A quick test using Array.Sort(ResultQty) doesn't keep it synchronized with the Result() array.
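On keeping the two arrays in step: Array.Sort has an overload that takes a keys array and an items array and reorders both together. A small sketch, assuming the counts are held as Integer (SortByQuantity is a made-up name, not something from the posts above):

```vb
Imports System

Module ParallelSortDemo
    ' Sorts both arrays in place by count, largest count first.
    Sub SortByQuantity(ByVal qty() As Integer, ByVal words() As String)
        ' The keys (qty) drive the sort; words are moved along with them.
        Array.Sort(qty, words)
        ' Array.Sort is ascending, so reverse both for most-frequent-first.
        Array.Reverse(qty)
        Array.Reverse(words)
    End Sub
End Module
```

Array.Sort is not a stable sort, so the order of words that share the same count is not guaranteed.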

Green Lantern




Re: Visual Basic Express Edition counting words in text box.

LouieG

Hi Green Lantern,

Sorry, I'm the wrong person to ask about performance trade-offs. If I must take a guess, I would say that Renee's answer would be faster: it uses the built-in constructs of VB, whereas mine is just programming to get to the same result.

I'll have to re-think my solution for the sorting bit.

Louie






Re: Visual Basic Express Edition counting words in text box.

ReneeC

This is about the highest performance you will attain AND it holds the sort.

Private Sub CNT()

Dim Indx As Integer

Dim Words As New ArrayList : Dim FrequencyList As New ArrayList
For Each wrd As String In TextBox1.Text.Split(" ")

Indx = GetIndex(wrd, Words)

If Indx < 0 Then

FrequencyList.Add(1)

Words.Add(wrd.ToLower)

Else

FrequencyList.Item(Indx) = FrequencyList.Item(Indx) + 1

End If

Next

Dim freq() As Object = FrequencyList.ToArray : Dim wrds() As Object = Words.ToArray

Array.Sort(freq, wrds) : Indx = Words.Count-1 : Words.Clear : FrequencyList.Clear

For i As Integer = 0 To Indx

TextBox2.Text &= i + 1 & ".) (" & freq(i) & ") " & wrds(i) & vbCrLf

Next

End Sub

Private Function GetIndex(ByVal Word As String, ByVal List As ArrayList) As Integer

For I As Integer = 0 To List.Count - 1

If Word.ToLower = List(I).tolower Then Return I

Next

Return -1

End Function






Re: Visual Basic Express Edition counting words in text box.

LouieG

Hi Green Lantern

I had a think about the sorting part of your problem.

If it was me, I would not even bother to sort the words while counting them.

When I have finished counting the words, I would put the results (the words and the counters) into a DataGridView. Then you can sort on both the words and the counters. The DataGridView sorts automatically when you click a column header, and it sorts all the columns at the same time, i.e. a row's data stays together.

I hope you get rid of all full stops and commas (just thought about that 'cos "hello" and "hello," will be counted as two different words).
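A rough sketch of that clean-up step, assuming it is enough to strip punctuation from the start and end of each token before counting. NormalizeWord and its list of characters are illustrative only and would need extending for real documents:

```vb
Imports System

Module PunctuationDemo
    ' Hypothetical helper: removes leading/trailing punctuation and
    ' lower-cases the word so "hello" and "Hello," count as one word.
    Function NormalizeWord(ByVal raw As String) As String
        Dim trimChars() As Char = {"."c, ","c, ";"c, ":"c, "!"c, "?"c, """"c, "("c, ")"c}
        Return raw.Trim(trimChars).ToLower()
    End Function
End Module
```

Punctuation inside a word (an apostrophe or a hyphen, say) is left alone by Trim, and a token that is nothing but punctuation comes back as an empty string, so it may be worth skipping empty results.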

Louie






Re: Visual Basic Express Edition counting words in text box.

ReneeC

Louie,

That would have to be slow. Btw, my algorithm has no optimization after the sort. Certainly a binary search on the keys would speed up processing enormously.
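One possible reading of the binary search idea, sketched under assumptions: keep the word list sorted at all times and look each new word up with List(Of String).BinarySearch instead of scanning the whole list. AddWord and the two parallel lists are hypothetical names, not part of the code posted earlier.

```vb
Imports System
Imports System.Collections.Generic

Module BinarySearchCount
    ' Adds one occurrence of word. words stays sorted, and counts(i)
    ' always holds the count for words(i).
    Sub AddWord(ByVal word As String, ByVal words As List(Of String), ByVal counts As List(Of Integer))
        Dim pos As Integer = words.BinarySearch(word, StringComparer.Ordinal)
        If pos >= 0 Then
            ' Found: bump the existing counter.
            counts(pos) += 1
        Else
            ' Not found: BinarySearch returns the bitwise complement of
            ' the index where the word belongs, so insert it there.
            Dim insertAt As Integer = Not pos
            words.Insert(insertAt, word)
            counts.Insert(insertAt, 1)
        End If
    End Sub
End Module
```

The lookup is a binary search rather than a linear scan, but each insert still shifts the list contents, so whether it beats a Dictionary on 100,000 words would need measuring.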






Re: Visual Basic Express Edition counting words in text box.

Adam D. Turner

Without using generics, here's more of an old school approach:


Dim myArray() = TextBox1.Text.Split(" ")

Dim mycounter As Integer = 0

For x As Integer = 0 To myArray.Length - 1

For y As Integer = 0 To myArray.Length - 1

If myArray(x).ToString = myArray(y).ToString Then

mycounter += 1

End If

Next

If ListBox1.Items.IndexOf(myArray(x) & " - " & mycounter) = -1 Then

ListBox1.Items.Add(myArray(x) & " - " & mycounter)

End If

mycounter = 0

Next

Adam




Re: Visual Basic Express Edition counting words in text box.

Adam D. Turner

In all practicality, it should be coded to be scalable. Although Renee's approach is new age, the multitude of looping would prove to be a performance nightmare as the application grew.

The ultimate approach would be to use an XML file that you can query against. Much more efficient.

I can post the code if you like.

Adam






Re: Visual Basic Express Edition counting words in text box.

Green Lantern

I had to get to bed after my last post. (It was after 1AM here). So I've got some catching up to do on these replies.

I'm using regex because the Split() function is limiting. I need to pull out only the words, so that "deep", deep and (deep) are all counted as the same word. I notice I need to tweak the expression, because tip-top comes out as two words, tip and top.
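A possible tweak, offered as a sketch and not as the final expression: let a word continue across a hyphen only when more letters follow it. ExtractWords is a hypothetical wrapper around the same Regex.Matches call used in the first post.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module HyphenRegexDemo
    ' Returns the lower-cased words in source, treating an internal
    ' hyphen (tip-top) as part of the word.
    Function ExtractWords(ByVal source As String) As List(Of String)
        ' One run of letters, then zero or more "-letters" groups.
        Dim parser As New Regex("\b[a-z]+(?:-[a-z]+)*\b")
        Dim found As New List(Of String)
        For Each m As Match In parser.Matches(source.ToLower())
            found.Add(m.Value)
        Next
        Return found
    End Function
End Module
```

Quotes and brackets around a word are still dropped, and a trailing hyphen is not included. Apostrophes are not handled, so a word like don't would still be split in two.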

Adam, you've really got my interest with this xml document idea. Yes, I'd appreciate seeing that code.

Thanks to all of you for your help.

Green Lantern






Re: Visual Basic Express Edition counting words in text box.

ReneeC

"Although Renee's approach is new age, the multitude of looping would prove to be a performance nightmare as the application grew."

Green Lantern got to bed at 1. I got to bed at four. I have already noted that there are things that can be done to speed up the looping.

Adamus, no matter how old you are, I'm very sure I go back further, YET it's suggested that my code is "New Age". I know the really good .Net engineers take great pride in their code, which actually reflects a concern for quality in what they do. I know the edge of the software universe is slipping away from me. It's growing faster than anyone can learn it. But I am appreciative when answering a question requires me to grow and do something that I have not done before, so that tomorrow I will be just a tiny bit more able than I am today.

May my code continue to be "New Age" and may I continue to grow in my understanding and what I can do.

"There's more to life than just DO loops". There are all kinds of structures to support our undertakings. Why not utilize them wisely and expeditiously in order to be more creative?