Littletommy


Hi all........

I have a table field that contains things such as.....

V DISCONTINUED W/O REPLACEMENT; V DISCONTINUED W/O REPLACEMENT

-I&S MASTER; 7 SUB 2610-139-3704; I&S MASTER; 7 SUB FOR 2610-139-3704

What I need to do is remove the duplicate phrases, leaving only one instance of each phrase in the field. The amount of phrases in the field varies...could be 2 as shown above or there could be 6. I suspect that the 'common denominator' is the semi-colon, but everything I've tried has failed.

Any help is greatly appreciated!




Re: How to remove redundant phrase in field?

dni


You may create a loop (do ...while) and read the field and compare strings, or if you have a limited number of phrases you may use in loop, function RAT().





Re: How to remove redundant phrase in field?

Littletommy

Thanks...but can you possibly give me an example of this






Re: How to remove redundant phrase in field?

dni

Yes, but let me know first, if you know all redundant phrases.




Re: How to remove redundant phrase in field?

Littletommy

Unfortunately, there are literally hundreds of thousands of possible phrases. Somehow I need to look at 'field1' in every record (several million records) and allow only one instance of a given phrase.. ..note: there can be more than one phrase in 'field1', but the same phrase cannot be there more than once.

thanks much for the help!





Re: How to remove redundant phrase in field?

dni

Who is generating the phrases and add them to field1




Re: How to remove redundant phrase in field?

dni

You may run this for each phrase (I did not tested it!!!):

SET EXACT ON
STORE 0 TO b
STORE 0 TO cnt0
STORE "" TO newf1
STORE "PHRASE 1" TO x1 &&*(phrase)
STORE <FIELD1> TO f1 &&field1
STORE LEN(x1) TO x2
STORE OCCURS(x1,f1) TO cnt
DO WHILE .T.
b=AT(x1,f1)
newf1=newf1+SUBSTR(f1,1,b)+SUBSTR(f1,b+x2)
cnt0=cnt0=cnt0+1
IF cnt0>cnt
EXIT
ENDIF
ENDDO
REPLACE field1 WITH newf1
SET EXACT OFF

When a new phrase appers you may store it in x1 and run the code.






Re: How to remove redundant phrase in field?

CDFAUX

First you should determine what a sentence is, for example:
- From a delimiter (start, ";", end) to another.
- A certain quantity of characters.
With delimiters can make a surer and faster code.
Taking N characters the code would be slower but perhaps it serves you, for example:

LOCAL x as Character
LOCAL i as Integer
LOCAL iEnd as Integer
LOCAL iStart as Integer
LOCAL iMinSize as Integer

LOCAL iPartial as Character
LOCAL iSize as integer

iMinSize = 4
iSize = iMinSize

x="Hola Carlos como te va Hola Carlos Hola Carlos como te va como te va "
iStart = 1
iEnd = LEN(x)

DO WHILE iStart + iSize < iEnd
iPartial = SUBSTR(x, iStart, iSize)
IF OCCURS(iPartial, x) > 1
DO WHILE OCCURS(iPartial, x) > 1 AND iStart + iSize < iEnd
iSize = iSize + 1
iPartial = SUBSTR(x, iStart, iSize)
ENDDO
iPartial = SUBSTR(x, iStart, iSize-1)
x=STRTRAN(x, iPartial, "", 1,1)
ELSE
iStart = iStart + 1
ENDIF
iSize = iMinSize
ENDDO

MESSAGEBOX(x)





Re: How to remove redundant phrase in field?

Littletommy

The data is extracted from a govt stock number database...unfortunately there is no way to control how the phrases appear...other that the fact that they are separtaed by semi-colons.





Re: How to remove redundant phrase in field?

CDFAUX

I have proven my code with your example and I have obtained:




Re: How to remove redundant phrase in field?

Littletommy

Thanks...that looks like what I need! But being the novice that I am with this stuff I have not been able to get it to work. Guess I'm having trouble translating the code.





Re: How to remove redundant phrase in field?

CDFAUX

Don't worry, we all were novices at some time.
Perhaps you understand this version better.

Saludos!




Re: How to remove redundant phrase in field?

CDFAUX

Please make several tests and revise the results well, since you could ruin the data easily with this method.



Re: How to remove redundant phrase in field?

SharathMum

assuming that data is
"V DISCONTINUED W/O REPLACEMENT; V DISCONTINUED W/O REPLACEMENT; I&S MASTER; 7 SUB 2610-139-3704; I&S MASTER; 7 SUB FOR 2610-139-3704;"
Delimited by ; and eliminate data which is same e.g V DISCONTINUED W/O REPLACEMENT must appear only once and
7 SUB 2610-139-3704 and 7 SUB FOR 2610-139-3704 are not the same and both must be retained

It assumes ; at the end of the string

CREATE CURSOR curImpData ( Datavalue c(100) )

x="V DISCONTINUED W/O REPLACEMENT; V DISCONTINUED W/O REPLACEMENT; I&S MASTER; 7 SUB 2610-139-3704; I&S MASTER; 7 SUB FOR 2610-139-3704;"

DO WHILE LEN(m.x) > 0

INSERT INTO curImpData VALUES ( ALLTRIM( SUBSTR(m.x, 1, AT(";",m.x)-1) ) )

x=SUBSTR(m.x, AT(";",m.x)+1)

ENDDO

SELECT distinct datavalue FROM curImpData INTO CURSOR curOutPut

curOutPut will only have unique value

if u want to rebuild the string than

SELECT curOutPut
x=""
SCAN
 x=x+ALLTRIM(DataValue)+";"
ENDSCAN









Re: How to remove redundant phrase in field?

Littletommy

Hello CDFAUX

Thought this worked for me, but here's what I got:

This.........

- I&S MASTER; R REFER TO AFT095REQ; R REFER TO AFT95; R REFER TO DATA REQUIRED; R REFER TO ITEM-SERIAL #; R REFER TO LIFE CONSUMED; R REFER TO T.O. 00-20-3; R REFER TO TIME CHANGE; 7 SUB FOR 2840-01-230-86

Became this........

-I&S MASTER095REQAFT95DATA REQUIRITEM-SERIAL #LIFE CONSUMED.O. 00-20-3; RREFER TO TIME CHANGE; 7 SUB FOR 2840-01-230-86

Guess I could say it's close(r)...but no cigar.