Re: how to delete lines from text file

Posted by Ted Davis on 10/10/06 00:39

On 8 Oct 2006 21:34:22 -0700, "batman" <uspensky@gmail.com> wrote:

>Here is some actual content from the file (shortened, of course, to just a
>few records). Ideally I would like to avoid using a batch file and keep
>it all at the SQL level (SQL 2000). The file looks much prettier pasted in
>here than it does in Notepad (where it shows squares).
>
>Thanks for your help.
>
>
>------ file starts below
>START-OF-FILE
>PROGRAMNAME=getdata
>DATEFORMAT=yyyymmdd
>
>START-OF-FIELDS
># Security Description
>TICKER
>EXCH_CODE
>NAME
>COUNTRY
>CRNCY
>SECURITY_TYP
>PAR_AMT
>EQY_PRIM_EXCH
>EQY_PRIM_EXCH_SHRT
>
># Industry Classification
>EQY_SIC_CODE
>EQY_SIC_NAME
>INDUSTRY_GROUP
>INDUSTRY_SUBGROUP
>INDUSTRY_SECTOR
>
># Identifiers
>ID_SEDOL1
>ID_WERTPAPIER
>ID_ISIN
>ID_DUTCH
>ID_VALOREN
>ID_FRENCH
>ID_BELGIUM
>ID_BB_COMPANY
>ID_BB_SECURITY
>ID_CUSIP
>ID_COMMON
>
># ADRs
>ADR_UNDL_TICKER
>ADR_SH_PER_ADR
>
># Dividend Information
>DVD_CRNCY
>EQY_DVD_SH_12M_NET
>EQY_DVD_SH_12M
>EQY_DVD_SH_LAST
>EQY_LAST_DPS_GROSS
>EQY_DVD_PCT_FRANKED
>EQY_DVD_TYP_LAST
>EQY_DVD_FREQ
>DVD_PAY_DT
>DVD_RECORD_DT
>DVD_DECLARED_DT
>EQY_SPLIT_DT
>EQY_SPLIT_RATIO
>DVD_EX_DT
>EQY_DVD_EX_FLAG
>
>INDUSTRY_SUBGROUP_NUM
>CNTRY_ISSUE_ISO
>MARKET_STATUS
>ID_BB_PARENT_CO
>ADR_UNDL_CMPID
>ADR_UNDL_SECID
>REL_INDEX
>PX_TRADE_LOT_SIZE
>PARENT_COMP_TICKER
>PARENT_COMP_NAME
>ID_LOCAL
>LONG_COMP_NAME
>PARENT_INDUSTRY_GROUP
>PARENT_INDUSTRY_SUBGROUP
>PARENT_INDUSTRY_SECTOR
>VOTING_RIGHTS
>ID_BB_PRIM_SECURITY_FLAG
>PAR_VAL_CRNCY
>EQY_SH_OUT
>EQY_SH_OUT_DT
>ID_BB_UNIQUE
>MARKET_SECTOR_DES
>IS_STK_MARGINABLE
>144A_FLAG
>TRANSFER_AGENT
>EQY_PRIM_SECURITY_TICKER
>EQY_PRIM_SECURITY_COMP_EXCH
>IS_SETS
>WHICH_JAPANESE_SECTION
>ADR_ADR_PER_SH
>EQY_PRIM_SECURITY_PRIM_EXCH
>EQY_FUND_CRNCY
>WHEN_ISSUED
>CDR_COUNTRY_CODE
>CDR_EXCH_CODE
>CNTRY_OF_INCORPORATION
>CNTRY_OF_DOMICILE
>SEC_RESTRICT
>EQY_SH_OUT_REAL
>ADR_UNDL_CRNCY
>MULTIPLE_SHARE
>PX_QUOTE_LOT_SIZE
>PX_ROUND_LOT_SIZE
>ID_SEDOL2
>SEDOL1_COUNTRY_ISO
>SEDOL2_COUNTRY_ISO
>ID_MIC_PRIM_EXCH
>ID_MIC_LOCAL_EXCH
>EQY_SH_OUT_TOT_MULT_SH
>SECURITY_TYP2
>ID_BB_PRIM_SECURITY
>EQY_OPT_AVAIL
>EQY_FREE_FLOAT_PCT
>END-OF-FIELDS
>
>TIMESTARTED=Tue Sep 26 17:33:28 EDT 2006
>START-OF-DATA
>AA US Equity|0|95|AA|US|ALCOA INC|US|USD|Common
>Stock|1.000000000|New York|UN|3334|ALUMINUM
>PROD|Mining|Metal-Aluminum|Basic
>Materials|2021805|850206|US0138171014|N.A.|N.A.|902130|094464|100046|1000|013817101|009988106|
>| |USD|.600|.600|.15|.15| |Regular
>Cash|Quarter|20061125|20061103|20060915|20000612|2 for
>1|20061101|N|14|US|ACTV| | | |SPX|1| | |N.A.|Alcoa Inc| | |
>|1.000|Y|USD|866.888|20060721|EQ0010004600001000|Equity|N.A.|N|EquiServe/First
>Chicago Trust Co Div|AA|US|N| | |UN|USD|N|US|EX|US|US|0|866888257|
>|N|1.00|100|N.A.|US|N.A.|XNYS| |866.888|Common Stock|1000|Y|99.6964|
>AALA US Equity|0|95|AALA|US|AMERALIA INC|US|USD|Common
>Stock|.010000000|OTC US|UV|N.A.| |Mining|Quarrying|Basic
>Materials|2023588|880772|US0235592062|N.A.|N.A.|N.A.|N.A.|101793|1000|023559206|N.A.|
>| | | | | | | | |None| | | |19930119|1 for 40| |N|18|US|ACTV| | |
>|SPX|1| | |N.A.|Ameralia Inc| | |
>|1.000|Y|USD|16.866|20040901|EQ0010179300001000|Equity|N.A.|N|Atlas
>Stock Trasfer, Inc.|AALA|US|N| | |UV| |N|US|EX|US|US|0|16866301|
>|N|1.00|100|N.A.|US|N.A.|XOTC| |16.866|Common Stock|1000|N|37.4822|
>CALS CN Equity|0|95|CALS|CN|ASPIRE CAPITAL INC|CA|CAD|Common
>Stock|N.A.|CNQ|CF|N.A.| |Investment Companies|Capital
>Pools|Financial|N.A.|N.A.|CA04537P1080|N.A.|N.A.|N.A.|N.A.|1185676|1001|04537P108|N.A.|
>|
>|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|702|CA|ACTV|
>| | |SPTSX|1| | |N.A.|Aspire Capital Inc| | | |.000|Y|
>|N.A.|N.A.|EQ0000000002830848|Equity|N.A.|N|Computershare Investor
>Services Inc.|CALS|CN|N| | |CF| |N|CA|CX|CA|CA|0|N.A.|
>|N|1.00|100|N.A.|N.A.|N.A.|XCNQ| |.000|Common Stock|1001|N|N.A.|
>805208Z CN Equity|0|95|805208Z|CN|CUDA CAPITAL CORP|CA|CAD|Common
>Stock|N.A.|Venture|CV|N.A.|
>|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|11105828|1000|N.A.|N.A.|
>|
>|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|N.A.|CA|PEND|
>| | |SPTSX|1| | |N.A.|Cuda Capital Corp| | | |.000|Y|
>|N.A.|N.A.|EQ0000000002831616|Equity|N.A.|N|N.A.|805208Z|CN|N| | |CV|
>|N|CA|CX|CA|CA|0|N.A.| |N|1.00|100|N.A.|N.A.|N.A.|XTSX| |.000|Common
>Stock|1000|N|N.A.|
>END-OF-DATA
>DATARECORDS=57129
>TIMEFINISHED=Tue Sep 26 17:51:45 EDT 2006
>END-OF-FILE
>
>------ file end above
>billious wrote:
>> "billious" <billious_1954@hotmail.com> wrote in message
>> news:4529c1ac$0$28985$a82e2bb9@reader.athenanews.com...
>> >
>> > "batman" <uspensky@gmail.com> wrote in message
>> > news:1160363613.855662.104440@b28g2000cwb.googlegroups.com...
>> >>I have a text file that looks like this:
>> >>
>> >> date = OCT0606
>> >> asdf
>> >> sdaf
>> >> asdfasdgsdgh
>> >> asdfsdfasdg
>> >> START-OF-DATA
>> >> asdfasdfg
>> >> asdfgdfgsfg
>> >> sadfsdfgsa
>> >> asdfgsdfg
>> >> END-OF-DATA
>> >> asdfgalsdkdfklmlkm
>> >> asdfgasdfg
>> >>
>> >>
>> >>
>> >> I need to clear everything from this file except the data between the
>> >> START-OF-DATA and END-OF-DATA markers using a batch file. Alternatively, I am
>> >> open to suggestions on how to import it using BULK INSERT in SQL without
>> >> changing the file at all. The data is pipe-separated but obviously has
>> >> plenty of junk data in it. I have 2 similar files, about 30 MB and
>> >> 60 MB in size. Thanks, everyone.
>> >>
>> >
>> > Since you are posting from XP, do you want an XP solution?
>> > alt.msdos.batch.nt deals with NT-series, and alt.msdos.batch with DOS and
>> > 9x.
>> >
>> > Meanwhile, since pipe is a special character in DOS, can you be a little
>> > more explicit about the contents of your file? How is
>> > START-OF-DATA/END-OF-DATA detected (does it involve the presence of a pipe
>> > or is it contained in one or more data-fields?) Does any line start with a
>> > semicolon? Is there any non-alphameric content other than the pipes?
>>
>> NT4/2K/XP/2K3 (NT+ systems) are discussed in alt.msdos.batch.nt as the
>> techniques used differ markedly from DOS/9x methods.
>>
>> For instance, given data like:
>>
>> ----- data begins -------
>> a|b|c
>> d|e|f
>> 0|start|1
>> 2|3|4
>> 5|6|7
>> 8|end|9
>> g|h|i
>> j|k|l
>> ----- data ends -------
>>
>> in a file "psd.txt" then the following will produce "psdout.txt"
>>
>> ----- batch begins -------
>> [1]@echo off
>> [2]setlocal enabledelayedexpansion
>> [3]set yel=Y
>> [4]for /f %%i in (psd.txt) do call :process "%%i" &if !yel!==Y echo
>> %%i>>psdout.txt
>> [5]goto :eof
>> [6]
>> [7]:process
>> [8]if %yel%==L set yel=Y
>> [9]if %yel%==N for /f "tokens=2delims=|" %%j in (%1) do if /i %%j==end set
>> yel=L
>> [10]if %yel%==Y for /f "tokens=2delims=|" %%j in (%1) do if /i %%j==start
>> set yel=N
>> [11]goto :eof
>> ------ batch ends --------
>>
>> Lines start [number] - any lines not starting [number] have been wrapped
>> and should be rejoined. The [number] that starts the line should be removed
>>
>> The label :eof is defined in NT+ to be end-of-file but MUST be expressed
>> as :eof
>>
>> Without better knowledge of your file's content, refining this is a little
>> difficult.

awk "{if( $0 ~ /END-OF-DATA/) f=0; if( f ) print $0; if( $0 ~ /START-OF-DATA/) f=1}" infile > outfile

All one line - awk is gawk.exe, free and open source, from
<http://gnuwin32.sourceforge.net/packages/gawk.htm>. That copies the
lines *between* the markers, but not the markers themselves. It's
easily rearranged to include them: swap the first and third
statements.
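
For anyone trying this from a POSIX-style shell rather than cmd.exe, here is a minimal sketch of both orderings. The sample contents and the file names (sample.txt, between.txt, with_markers.txt) are made up for illustration, and the single quotes are a shell assumption; under cmd.exe keep the double quotes used above.

```shell
# Build a small sample file (hypothetical contents modelled on the thread).
cat > sample.txt <<'EOF'
date = OCT0606
junk header
START-OF-DATA
a|b|c
d|e|f
END-OF-DATA
junk trailer
EOF

# Original order: clear the flag, print, set the flag.
# Only the lines between the markers are copied.
awk '{ if ($0 ~ /END-OF-DATA/) f = 0; if (f) print $0; if ($0 ~ /START-OF-DATA/) f = 1 }' sample.txt > between.txt

# First and third statements swapped: set the flag, print, clear the flag.
# The marker lines themselves are copied as well.
awk '{ if ($0 ~ /START-OF-DATA/) f = 1; if (f) print $0; if ($0 ~ /END-OF-DATA/) f = 0 }' sample.txt > with_markers.txt
```

The patterns are unanchored, so a data line that happened to contain START-OF-DATA or END-OF-DATA would also flip the flag. If the markers always begin the line, /^START-OF-DATA/ and /^END-OF-DATA/ are stricter.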

--
T.E.D. (tdavis@gearbox.maem.umr.edu) Remove "gearbox.maem" to get real address - that one is dead

 
