learn linux - sed, sort, uniq, join, cut, paste, split

2010-12-31  Source: original to this site  Category: OS  Views: 171

November 5
============================== sed ==============================
1, Invoking sed
There are three ways to invoke sed: type the command on the command line; put the sed commands in a script file and pass that file to sed; or put the sed commands in a script file and make the script itself executable.
The command-line format is:
sed [options] 'sed-command' input-file
Remember to wrap the actual command in single quotes when using sed on the command line. sed also accepts double quotes.

The format for using a sed script file is:
sed [options] -f sed-script-file input-file

To run a script file whose first line names the sed interpreter, the format is:
sed-script-file [options] input-file
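The three ways of invoking sed can be sketched as follows. The file names (input.txt, upper.sed, upper) and the substitution are made up for illustration, and the executable form assumes the temporary directory allows running scripts and that the interpreter path reported by `command -v sed` is usable on the first line of a script.

```shell
# Sample input in a temporary directory (hypothetical data).
tmp=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmp/input.txt"

# 1) Command typed directly on the command line.
inline=$(sed 's/a/A/' "$tmp/input.txt")

# 2) Command stored in a script file and passed with -f.
printf 's/a/A/\n' > "$tmp/upper.sed"
from_file=$(sed -f "$tmp/upper.sed" "$tmp/input.txt")

# 3) Script file whose first line names the sed interpreter, made executable.
printf '#!%s -f\ns/a/A/\n' "$(command -v sed)" > "$tmp/upper"
chmod +x "$tmp/upper"
executable=$("$tmp/upper" "$tmp/input.txt")

printf '%s\n' "$inline" "$from_file" "$executable"
rm -rf "$tmp"
```

All three forms run the same substitution, so they should print the same two lines.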

The sed options are as follows:
-n  Do not print. sed does not write edited lines to standard output; the default is to print all lines (edited and unedited). The p command can be used to print the edited lines.
-e  The next argument is an editing command. Use this option when combining several editing commands. With a single sed command it is unnecessary, though specifying it does no harm.
-f  Use this option when calling a sed script file. It tells sed that a script file supplies all the sed commands, for example: sed -f myscript.sed input_file, where myscript.sed is the file holding the sed commands.
2, How sed locates text in a file
sed scans the input file starting, by default, from the first line. There are two ways to locate text:
1) Use line numbers, either a single number or a range of line numbers.
2) Use regular expressions.
The ways sed can locate text in a file are:
x                    x is a line number, such as 1
x,y                  the range of lines from x to y; 2,5 means lines 2 through 5
/pattern/            lines containing the pattern, such as /disk/ or /[a-z]/
/pattern/,/pattern/  from a line matching the first pattern through a line matching the second, such as /disk/,/disks/
/pattern/,x          from a line matching the pattern through line x, such as /ribbon/,3
x,/pattern/          from line x through the next line matching the pattern, such as 3,/vdu/
x,y!                 every line except lines x through y, such as 1,2!
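A small sketch of these addressing forms, run against a made-up sample file (the file name and its contents are assumptions, not taken from the original):

```shell
# Build a five-line sample file in a temporary directory (hypothetical data).
tmp=$(mktemp -d)
printf 'one disk\ntwo\nthree ribbon\nfour\nfive disks\n' > "$tmp/sample.txt"

# Single line number: print line 2 only.
by_line=$(sed -n '2p' "$tmp/sample.txt")

# Line range: print lines 2 through 3.
by_range=$(sed -n '2,3p' "$tmp/sample.txt")

# Pattern: print lines containing "ribbon".
by_pattern=$(sed -n '/ribbon/p' "$tmp/sample.txt")

# Line number to pattern: from line 2 through the next line matching "four".
line_to_pattern=$(sed -n '2,/four/p' "$tmp/sample.txt")

# Negation: print everything except lines 1 through 3.
negated=$(sed -n '1,3!p' "$tmp/sample.txt")

printf '%s\n' "$by_line" "$by_range" "$by_pattern" "$line_to_pattern" "$negated"
rm -rf "$tmp"
```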
3, Basic sed editing commands
The sed editing commands are:
p      print the matching lines
=      print the line number
a\     append new text after the addressed line
i\     insert new text before the addressed line
d      delete the addressed lines
c\     replace the addressed lines with new text
s      substitute a replacement for the matched pattern
r      read text from another file
w      write text to a file
q      quit after the first pattern match, or quit immediately
l      display control characters as their octal ASCII equivalents
{}     group of commands to run on the addressed lines
n      read the next input line and carry on with the next command
g      replace the pattern space with the contents of the hold space
y      transliterate characters
N      append the next input line to the pattern space, so a pattern can match across lines
4, sed and regular expressions
sed recognizes any basic regular expression, along with the usual pattern and line matching rules.
5, Basic sed programming examples
Displaying lines with p(rint):
The print command has the format [address[,address]]p. To display one line of text, give sed the line number.
sed -n '2p' quote.txt
Display lines 2 through 4:
sed -n '2,4p' url_access_detail.txt

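Printing by pattern:
sed -n '/1028f/p' url_access_detail.txt
To match the literal text /?1028f, escape the slash. In a basic regular expression ? is already literal, and GNU sed treats \? as "zero or one", so ? is left unescaped:
sed -n '/\/?1028f/p' url_access_detail.txt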

Using a pattern together with a line number:
Start at line 4 and print through the next line that matches the pattern:
sed -n '4,/\/?1028f/p' url_access_detail.txt

Displaying the whole file:
Set the range from the first line to the last line, 1,$. The $ stands for the last line:
sed -n '1,$p' quote.txt

Matching any characters:
The pattern /.*ing/ matches any character repeated zero or more times, followed by ing. Use it to find lines containing a word that ends in ing (strictly, any line that contains ing):
sed -n '/.*ing/p' quote.txt

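Printing line numbers:
To print a line number, use the equals sign =. To print the line numbers of lines that match a pattern, use the format /pattern/=.
sed -e '/music/=' quote.txt
(GNU sed on Linux supports this. Without -n the whole file is printed, with the line number on its own line before each matching line; add -n to print only the numbers.)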
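The difference between running = with and without -n can be seen in a short sketch. The contents of quote.txt here are invented for the example.

```shell
# Sample file in a temporary directory (hypothetical data).
tmp=$(mktemp -d)
printf 'no match here\nsome music\nsilence\nmore music\n' > "$tmp/quote.txt"

# With -n, normal output is suppressed, so only the numbers printed by = appear.
numbers=$(sed -n '/music/=' "$tmp/quote.txt")

# Without -n, every line is printed and each matching line is preceded by its number.
with_text=$(sed '/music/=' "$tmp/quote.txt")

printf '%s\n---\n%s\n' "$numbers" "$with_text"
rm -rf "$tmp"
```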

Appending text:
sed -n '/1028f/p' url_access_detail.txt | sed '/h/a\ then haha'
To insert text before the line instead, use i\
To replace the line, use c\
To delete the line, use d (no backslash and no text follow it)
6, Replacing text
The substitute command replaces a matched pattern with a replacement. Its format is:
[address[,address]]s/pattern-to-find/replacement-pattern/[gpwn]
s tells sed that this is a substitution: it looks for pattern-to-find and, on a match, replaces it with replacement-pattern.
g  By default only the first occurrence of the pattern on each line is replaced; use g to replace every occurrence.
p  Print the line when a substitution was made. It is normally combined with the -n option, which suppresses the default output, so that only the changed lines are shown.
w file  Write the changed lines to the named file.
sed 's/haha/hehe/gw a.out' quote.txt

Modifying a string with substitution:
To add to or modify a string, use the & character in the replacement. & stands for the text that the pattern matched, so it can be placed anywhere inside the replacement string. The idea is to give the pattern to be matched first, then write the replacement as the text to add followed by &, which puts the added text in front of the matched text. For example, s/nurse/"Hello" &/p gives the following result:
sed -n 's/nurse/"Hello" &/p' quote.txt
Original sentence: The nurse come from china.
After substitution: The "Hello" nurse come from china.
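As a quick check of how & behaves, the sentence above can be fed to sed directly. The second substitution, which wraps the match in brackets, is an extra illustration that is not in the original.

```shell
line='The nurse come from china.'

# Put new text in front of the matched word; & is replaced by the match itself.
greeted=$(printf '%s\n' "$line" | sed 's/nurse/"Hello" &/')

# & can sit anywhere in the replacement, for example inside brackets.
wrapped=$(printf '%s\n' "$line" | sed 's/nurse/[&]/')

printf '%s\n%s\n' "$greeted" "$wrapped"
```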
7, Reading text from a file
While processing a file, sed can read text from another file and add it to the output after each line that matches the address. The format is: address r filename
sed '/company/r append.txt' quote.txt
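A minimal sketch of the r command, using invented contents for quote.txt and append.txt:

```shell
# Two small files in a temporary directory (hypothetical data).
tmp=$(mktemp -d)
printf 'first line\nthe company line\nlast line\n' > "$tmp/quote.txt"
printf 'APPENDED TEXT\n' > "$tmp/append.txt"

# After every line matching /company/, emit the contents of append.txt.
merged=$(cd "$tmp" && sed '/company/r append.txt' quote.txt)

printf '%s\n' "$merged"
rm -rf "$tmp"
```

The text of append.txt appears right after the matching line; quote.txt itself is not modified.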
8, Displaying control characters in a file
The sed format is:
[address[,address]]l
sed -n '1,$l' quote.txt
(The command is the letter l, not the digit 1. GNU sed on Linux supports it.)
9, Processing output messages
Suppose a command produces the following output:
Database Size DateCreated
--------------------------
newlog 2289 12/11/2005
mysql 1909 09/12/2005

(2 row affected)

To use this output in further automated processing, only the database names are needed. The steps are:
1) Use s/-*//g to remove the dashes ------.
2) Use /^$/d to delete empty lines.
3) Use $d to delete the last line.
4) Use 1d to delete the first line.
5) Use awk '{print $1}' to print the first column.
The full command uses cat and pipes the result through each sed command in turn:
cat sql.txt | sed 's/-*//g' | sed '/^$/d' | sed '$d' | sed '1d' | awk '{print $1}'
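The same clean-up can also be tried with all the steps given to one sed process through repeated -e options. This is only a sketch: the sample sql.txt below, including the dashed rule under the header, is an assumption about what the original output looked like, and the single-process form matches the pipeline only because the last input line is not blank.

```shell
# Recreate the sample report in a temporary directory (assumed layout).
tmp=$(mktemp -d)
cat > "$tmp/sql.txt" <<'EOF'
Database Size DateCreated
--------------------------
newlog 2289 12/11/2005
mysql 1909 09/12/2005

(2 row affected)
EOF

# Strip dashes, drop empty lines, drop the last and first lines, keep column 1.
names=$(sed -e 's/-*//g' -e '/^$/d' -e '$d' -e '1d' "$tmp/sql.txt" | awk '{print $1}')

printf '%s\n' "$names"
rm -rf "$tmp"
```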

Removing leading line numbers:
sed 's/^[0-9]*//' data.txt

10, Some commonly used one-liners
In the patterns below [ ] is a bracket expression holding one space, and <TAB> stands for a literal tab character.
's/\.$//g'            delete a period at the end of the line
-e '/abcd/d'          delete lines containing abcd
's/[ ][ ][ ]*/ /g'    replace two or more spaces with a single space
's/^[ ][ ]*//g'       delete spaces at the start of the line
's/\.[ ][ ]*/ /g'     replace a period followed by one or more spaces with a single space
'/^$/d'               delete blank lines
's/^.//g'             delete the first character of the line
's/COL\(...\)//g'     delete COL and the three characters that follow it
's/^\///g'            remove the leading / from a path
's/[ ]/<TAB>/g'       replace every space with a tab
's/^<TAB>//g'         delete a tab at the start of the line
's/<TAB>*//g'         delete all tabs
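A few of these one-liners can be tried on short strings, as in the sketch below. The input strings are invented for the example.

```shell
# Squeeze runs of two or more spaces down to one space.
squeezed=$(printf 'a   b    c\n' | sed 's/[ ][ ][ ]*/ /g')

# Remove spaces at the start of the line.
trimmed=$(printf '   indented\n' | sed 's/^[ ][ ]*//')

# Remove a period at the end of the line.
no_period=$(printf 'End of line.\n' | sed 's/\.$//')

# Delete blank lines.
no_blank=$(printf 'x\n\ny\n' | sed '/^$/d')

# Remove the leading slash from a path.
relative=$(printf '/usr/local/bin\n' | sed 's/^\///')

printf '%s\n' "$squeezed" "$trimmed" "$no_period" "$no_blank" "$relative"
```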
==================== Merge and split ====================
sort uniq join cut paste split
==================== sort usage ====================
The sort command can sort on many different fields, in different column orders.
1, sort options
The general format of the sort command is:
sort -cmu -o output_file [other options] +pos1 +pos2 input_files
A brief description of the sort options:
-c  check whether a file is already sorted.
-m  merge two sorted files.
-u  remove all duplicate lines.
-o  name of the output file that stores the sorted result.
Other options:
-b  ignore leading blanks when sorting on a field.
-n  sort the field numerically.
-t  field separator; use it when fields are separated by something other than spaces or tabs.
-r  reverse the sort order.
+n  n is a field number; start sorting at this field.
-n  n is a field number; the sort key stops before this field, so it is left out of the comparison. Generally used together with +n.
pos1  written as m.n, where m is the field number and n is the number of characters to skip before sorting starts; for example 4.6 means sort on the 5th field, starting at its 7th character.
2, How sort starts up
By default sort treats one or more spaces as the field separator. To use a different separator, give the -t option.
When sort runs, it first checks whether -t is set. If so, it uses that separator to split each record into field 0, field 1 and so on; if not, it uses spaces. By default sort compares the whole line unless a field number is specified.
An important point about the +pos notation is that it counts the first field as field 0, the second field as field 1, and so on. On current GNU sort a bare +1 is taken as a file name; the -k option, which counts fields from 1, is the portable form (+1 corresponds to -k2).
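3, Checking whether a file is already sorted
sort -c prints nothing and exits with status 0 if the file is sorted; otherwise it reports the first line that is out of order.
sort -c data.txt
4, Reverse sorting
To reverse the sorted result, use the -r option.
sort -t: -r video.txt
5, Sorting on a specified field
Sometimes only the 2nd field (sort key 1 in +pos notation) should be sorted on.
sort -t: +1 video.txt
On current GNU sort, where a bare +1 is taken as a file name, the equivalent is:
sort -t: -k2 video.txt
6, Numeric field sorting
Use the -n option. For numeric fields it is required, or the result will not be what you expect.
sort -t: +3n video.txt
The -k equivalent is:
sort -t: -k4n video.txt
7, Unique sorting
Use the -u option to sort and remove duplicate lines, keeping one copy of each.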
8, Other ways to sort, using -k
The sort key can also be specified with the -k option, which counts fields from 1.
sort -t: -k4 video.txt
Note that -k4 means a key running from field 4 to the end of the line; write -k4,4 to limit the key to field 4 alone.

Sorting on several keys with -k:
The order of the sort keys can be specified. To sort first on field 4 and then on field 1, use -k4 -k1:
sort -t: -k4 -k1 video.txt
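A sketch of -k in use, with a made-up colon-separated video.txt (title:distributor:rentals:stock is an assumed layout, not something the original specifies). LC_ALL=C is set so that the ordering does not depend on the locale.

```shell
# Colon-separated sample data in a temporary directory (hypothetical).
tmp=$(mktemp -d)
cat > "$tmp/video.txt" <<'EOF'
Boys in Company C:HK:192:2192
Alien:HK:119:1982
The Hill:KL:63:2972
Aliens:HK:532:4892
EOF

# Numeric sort on field 4 only, then show just the titles.
by_stock=$(LC_ALL=C sort -t: -k4,4n "$tmp/video.txt" | cut -d: -f1)

# Sort on field 2 first, then on field 3 numerically in reverse within each group.
by_two_keys=$(LC_ALL=C sort -t: -k2,2 -k3,3nr "$tmp/video.txt" | cut -d: -f1)

printf '%s\n---\n%s\n' "$by_stock" "$by_two_keys"
rm -rf "$tmp"
```

Writing each key as -kN,N keeps the comparison inside that one field, which is usually what is wanted when several keys are combined.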
9, Specifying the sort sequence
When specifying the order of the sort keys, the -n form marks where a key stops, so that some fields are left out of the comparison. Consider this command:
sort +0 -2 +3
It sorts starting at field 0, stops before field 2 (so field 2 is ignored), and then sorts on field 3.
10, Using pos
Another way to say where in a field the sort should start is the format:
sort +field.characterin
This sorts starting at field number field, beginning at character offset characterin within that field.
11, Using head and tail with sorted output
head and tail can be used to look at just the beginning or the end of a large text file, such as a sort result.
head -200 filename
12, Using sort on awk output
13, Merging two sorted files
Before two files are merged, both must already be sorted.
Use -m +0. The command below merges the sorted file video2.txt into the existing video.sort, sorting on the first field. The +0 is not strictly needed, but it is included to be safe. (On current GNU sort, drop the +0 or write -k1, because a bare +0 is taken as a file name.)
sort -t: -m +0 video2.txt video.sort
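Checking and merging can be sketched together. The file names and contents are invented, and LC_ALL=C is assumed so that the ordering is the plain byte order.

```shell
# Two already-sorted files and one unsorted file (hypothetical data).
tmp=$(mktemp -d)
printf 'apple\ncherry\n' > "$tmp/a.sorted"
printf 'banana\ndate\n' > "$tmp/b.sorted"
printf 'pear\napple\n' > "$tmp/unsorted"

# sort -c exits with status 0 only when the file is already in order.
if LC_ALL=C sort -c "$tmp/a.sorted" 2>/dev/null; then a_state=sorted; else a_state=unsorted; fi
if LC_ALL=C sort -c "$tmp/unsorted" 2>/dev/null; then u_state=sorted; else u_state=unsorted; fi

# sort -m interleaves the two sorted inputs without re-sorting them.
merged=$(LC_ALL=C sort -m "$tmp/a.sorted" "$tmp/b.sorted")

printf '%s %s\n%s\n' "$a_state" "$u_state" "$merged"
rm -rf "$tmp"
```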

==================== uniq usage ====================
uniq removes or suppresses duplicate lines in a text file. It generally assumes the file has been sorted, and only then is the result correct.
sort's unique option (-u) removes all duplicate lines, but the uniq command does not. What counts as a duplicate line for uniq? Only lines that repeat consecutively, with no other text in between.
The general format of the command is:
uniq -u -d -c -f n input_file output_file
The options mean:
-u  show only lines that are not repeated.
-d  show only repeated lines, one copy of each.
-c  prefix each line with the number of times it occurred.
-f n  n is a number; the first n fields are ignored.
Some systems do not recognize the -f option; use -n there instead, where n is the number of fields to skip.
Testing on a particular field:
Use -n (or -f n) to test only part of the line for uniqueness. For example, -5 tests the fields after the first 5. Fields are counted from 1.
To ignore field 1 and test uniqueness from field 2 on, skip one field. The following file contains a set of data whose 2nd field is a group code.
uniq -f1 parts.txt or
uniq -1 parts.txt
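The behaviour of uniq on unsorted and sorted input, and the effect of skipping a field, can be seen in this sketch. All data here is invented.

```shell
# An unsorted list with repeats (hypothetical data).
tmp=$(mktemp -d)
printf 'b\na\nb\nb\nc\n' > "$tmp/items.txt"

# On unsorted input only adjacent repeats collapse, so b appears twice.
adjacent_only=$(uniq "$tmp/items.txt")

# After sorting, -d lists repeated lines once and -u lists lines that occur once.
repeated=$(LC_ALL=C sort "$tmp/items.txt" | uniq -d)
singles=$(LC_ALL=C sort "$tmp/items.txt" | uniq -u)

# -c prefixes a count; awk reshapes it to value:count to avoid padding issues.
counts=$(LC_ALL=C sort "$tmp/items.txt" | uniq -c | awk '{print $2 ":" $1}')

# -f1 skips the first field, so lines are compared on the group code only.
by_group=$(printf 'AK123 OP\nDK122 OP\nEK999 XX\n' | uniq -f1)

printf '%s\n' "$adjacent_only" "$repeated" "$singles" "$counts" "$by_group"
rm -rf "$tmp"
```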
==================== join usage (powerful, much like an SQL inner join) ====================
join merges lines from two sorted text files.
Here is how join works. There are two files, file 1 and file 2, both already sorted. Each file has some elements that relate to the other file. Because of this relationship, join puts the two files together, somewhat like updating a master file so that it includes the elements common to both files.
To use join effectively, each input file must first be sorted on the join field.
The format of join is:
join [options] file1 file2
-a n  n is a file number (1 or 2). Also show the unmatched lines from that file. For example, -a1 shows the unmatched lines from the first file and -a2 those from the second file.
-o n.m  n is the file number and m is the field number. 1.3 means show only the third field of file 1. Separate each n.m with a comma, as in 1.3,2.1.
-j n m  n is the file number and m is the field number. Join on a field other than the default.
-t  field separator, for fields not separated by spaces or tabs. For example, -t: makes the colon the separator.

By default the join field is the first field of each file (field 0 in the notation used above), and the join key is printed once rather than twice:
join names.txt town.txt
1, Unmatched lines
The following example shows both the matched and the unmatched lines:
join -a1 -a2 names.txt town.txt
Show the matches plus the unmatched lines of the first file only:
join -a1 names.txt town.txt
2, Selective joins
Use the -o option to choose which fields are output. For example, to create a file containing only names and towns, tell join which fields to display.
Use 1.1 to show the first field of the first file and 2.2 to show the second field of the second file, separated by a comma. The command is:
join -o 1.1,2.2 names.txt town.txt
Use -j n m to join on other fields. For example, to join on field 3 of file 1 and field 2 of file 2, the command is:
join -j1 3 -j2 2 names.txt town.txt
The current POSIX spelling of the same join is:
join -1 3 -2 2 names.txt town.txt
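A sketch of the join variants on two small files. The contents of names.txt and town.txt are invented, both are sorted on their first field, and the -v1 form (unmatched lines of file 1 only) is an addition that the original does not mention.

```shell
# Two files sorted on their first field (hypothetical data).
tmp=$(mktemp -d)
printf 'Adams A451\nBrown B222\nClark C333\n' > "$tmp/names.txt"
printf 'Adams Leeds\nClark Oxford\nDavis York\n' > "$tmp/town.txt"
cd "$tmp"

# Default: only lines whose first field appears in both files.
inner=$(LC_ALL=C join names.txt town.txt)

# -a1 -a2: matched lines plus the unmatched lines from both files.
outer=$(LC_ALL=C join -a1 -a2 names.txt town.txt)

# -o: choose output fields, here the name from file 1 and the town from file 2.
selected=$(LC_ALL=C join -o 1.1,2.2 names.txt town.txt)

# -v1: only the lines of file 1 that have no match in file 2.
only_first=$(LC_ALL=C join -v1 names.txt town.txt)

cd - >/dev/null
printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$inner" "$outer" "$selected" "$only_first"
rm -rf "$tmp"
```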

==================== cut usage ====================
cut extracts characters or fields from each line of a text file or of standard input. The extracted text can then be pasted into another text file.
The general format of cut is:
cut [options] file1 file2
The available options are:
-c list  the character positions to cut.
-f list  the field numbers to cut.
-d  a field separator other than the default tab.
Ranges for -c are written as follows:
cut -c 1,5-7    cut character 1, then characters 5 through 7.
cut -c1-50      cut the first 50 characters.
-f uses the same list format as -c:
cut -f 1,5      cut field 1 and field 5.
cut -f 1,10-12  cut field 1 and fields 10 through 12.

Cutting specified fields:
List the fields to cut separated by commas. To cut fields 1 and 3, use:
cut -d: -f1,3 pers
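The list formats for -f and -c can be checked on a single colon-separated record. The record below is an invented example in the style of a passwd entry.

```shell
record='root:x:0:0:root:/root:/bin/bash'

# Fields 1 and 3, with the colon as separator.
fields=$(printf '%s\n' "$record" | cut -d: -f1,3)

# Field 1, then fields 5 through 7.
field_range=$(printf '%s\n' "$record" | cut -d: -f1,5-7)

# Character 1, then characters 5 through 7.
chars=$(printf '%s\n' "$record" | cut -c1,5-7)

# The first 4 characters.
first_four=$(printf '%s\n' "$record" | cut -c1-4)

printf '%s\n' "$fields" "$field_range" "$chars" "$first_four"
```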

==================== paste usage ====================
cut extracts columns or fields of data from a text file or standard input; paste then puts such data back together to form related files. When pasting data from two different sources, sort them first and make sure both files have the same number of lines.
paste joins the corresponding lines of the different files side by side. By default the pieces are separated by a tab, unless the -d option names a different separator.
The paste format is:
paste -d -s - file1 file2
The options have the following meanings:
-d  a field separator other than the default tab. For example, -d@ separates the fields with @.
-s  paste all the lines of each file onto a single line, one output line per file, instead of pasting line by line across files.

Using paste in a pipeline:
paste has a very useful operand, the dash (-). Each - makes paste read one line from standard input.
To display a directory listing in 4 columns with a colon as the field separator:
ls | paste -d":" - - - -
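The dash operand, the -d option and the -s option can be sketched with a short list in place of a directory listing. The file names and contents are invented.

```shell
# Sample files in a temporary directory (hypothetical data).
tmp=$(mktemp -d)
printf 'a\nb\nc\nd\ne\nf\ng\nh\n' > "$tmp/list.txt"
printf 'Adams\nBrown\n' > "$tmp/names.txt"
printf 'Leeds\nYork\n' > "$tmp/towns.txt"

# Four dashes: read four lines of standard input per output line.
four_columns=$(paste -d: - - - - < "$tmp/list.txt")

# Two files side by side, separated by @ instead of the default tab.
side_by_side=$(paste -d@ "$tmp/names.txt" "$tmp/towns.txt")

# -s: all lines of one file joined onto a single line.
serial=$(paste -s -d, "$tmp/list.txt")

printf '%s\n---\n%s\n---\n%s\n' "$four_columns" "$side_by_side" "$serial"
rm -rf "$tmp"
```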
==================== split usage ====================
split breaks a large file into smaller files.

split -output_file_size input_filename output_filename
By default the output files are named xaa through xzz: x is the default name prefix, and aa through zz are the suffix letters assigned in sequence. If an output file name is given, it replaces x as the prefix.
The following writes 5000 lines to each output file, named splitfileaa, splitfileab and so on (the current spelling of -5000 is -l 5000):
split -5000 url_access_user.txt splitfile
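A small-scale sketch of the same idea, splitting a 12-line file into pieces of 5 lines. The file name input.txt and the prefix part_ are made up for the example.

```shell
# A 12-line input file in a temporary directory (hypothetical data).
tmp=$(mktemp -d)
seq 1 12 > "$tmp/input.txt"

# Split into files of 5 lines each, named part_aa, part_ab, ...
(cd "$tmp" && split -l 5 input.txt part_)

# List the pieces and count the lines in the last one.
pieces=$(cd "$tmp" && ls part_* | tr '\n' ' ')
last_lines=$(wc -l < "$tmp/part_ac" | tr -d ' ')

# Concatenating the pieces in name order should reproduce the input.
if cat "$tmp"/part_* | cmp -s - "$tmp/input.txt"; then roundtrip=ok; else roundtrip=differs; fi

printf '%s\n%s\n%s\n' "$pieces" "$last_lines" "$roundtrip"
rm -rf "$tmp"
```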
