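For details, see the previous installment: [Yutian Red Hat Beginner's Guide] Part 7: Shell Basics.
This installment continues our RHEL 8.0 series with a new topic: text processing.
On Windows we handle text with tools such as Notepad and Office. How do we process text on Linux? In this chapter we learn how to use Linux tools to extract, analyze, and manipulate text data.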
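1. Viewing text with cat, more, and less
cat: prints one or more files to standard output.
more: browses a file one page at a time.
less: browses a file one page at a time; the man command uses less as its pager.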
[root@localhost tmp]# ls
file.txt
[root@localhost tmp]# cat file.txt
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
[root@localhost tmp]#
[root@localhost tmp]# more file.txt
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
--More--(12%)
[root@localhost tmp]#
[root@localhost tmp]# less file.txt
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
Red Hat Certified Engineer
file.txt
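The description of cat mentions "one or more files". The following is a minimal sketch of that behaviour, assuming two small sample files (a.txt and b.txt are hypothetical names created in a temporary directory, not files from this article):

```shell
# Hypothetical sample files in a temporary directory.
workdir=$(mktemp -d)
printf 'Red Hat Certified Engineer\n' > "$workdir/a.txt"
printf 'Red Hat Certified System Administrator\n' > "$workdir/b.txt"

# cat accepts several files and prints them one after another.
combined=$(cat "$workdir/a.txt" "$workdir/b.txt")
printf '%s\n' "$combined"

# -n numbers every output line.
numbered=$(cat -n "$workdir/a.txt" "$workdir/b.txt")
printf '%s\n' "$numbered"

# For long files, page through the content interactively instead,
# for example: less "$workdir/a.txt"
rm -r "$workdir"
```

cat suits short files and pipelines; for anything longer than a screen, more or less is usually the more comfortable choice.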
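2. Selecting text with head and tail
head: shows the first 10 lines of a file; use the -n option to set how many lines are shown.
tail: shows the last 10 lines of a file; use the -n option to set how many lines are shown. The -f option keeps printing content appended to the end of the file in the current terminal, which is very useful for monitoring log files.
[root@localhost tmp]# cat file1.txt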
1 Red Hat Certified Engineer
2 Red Hat Certified Engineer
3 Red Hat Certified Engineer
4 Red Hat Certified Engineer
5 Red Hat Certified Engineer
6 Red Hat Certified Engineer
7 Red Hat Certified Engineer
8 Red Hat Certified Engineer
9 Red Hat Certified Engineer
10 Red Hat Certified Engineer
11 Red Hat Certified Engineer
12 Red Hat Certified Engineer
13 Red Hat Certified Engineer
14 Red Hat Certified Engineer
15 Red Hat Certified Engineer
16 Red Hat Certified Engineer
17 Red Hat Certified Engineer
18 Red Hat Certified Engineer
19 Red Hat Certified Engineer
20 Red Hat Certified Engineer
[root@localhost tmp]# head file1.txt
1 Red Hat Certified Engineer
2 Red Hat Certified Engineer
3 Red Hat Certified Engineer
4 Red Hat Certified Engineer
5 Red Hat Certified Engineer
6 Red Hat Certified Engineer
7 Red Hat Certified Engineer
8 Red Hat Certified Engineer
9 Red Hat Certified Engineer
10 Red Hat Certified Engineer
[root@localhost tmp]# tail file1.txt
11 Red Hat Certified Engineer
12 Red Hat Certified Engineer
13 Red Hat Certified Engineer
14 Red Hat Certified Engineer
15 Red Hat Certified Engineer
16 Red Hat Certified Engineer
17 Red Hat Certified Engineer
18 Red Hat Certified Engineer
19 Red Hat Certified Engineer
20 Red Hat Certified Engineer
[root@localhost tmp]#
[root@localhost tmp]# tail -f file1.txt
11 Red Hat Certified Engineer
12 Red Hat Certified Engineer
13 Red Hat Certified Engineer
14 Red Hat Certified Engineer
15 Red Hat Certified Engineer
16 Red Hat Certified Engineer
17 Red Hat Certified Engineer
18 Red Hat Certified Engineer
19 Red Hat Certified Engineer
20 Red Hat Certified Engineer
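11111
The final line, 11111, shows up in the tail -f output once the following command is run from a second terminal:
[root@localhost tmp]# echo 11111 >> file1.txt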
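The -n option is described above but not demonstrated. A minimal sketch, assuming a 20-line sample file generated on the fly to resemble file1.txt (the temporary path is hypothetical):

```shell
workdir=$(mktemp -d)

# Build a 20-line sample similar to file1.txt above.
i=1
while [ "$i" -le 20 ]; do
    printf '%d Red Hat Certified Engineer\n' "$i"
    i=$((i + 1))
done > "$workdir/file1.txt"

first3=$(head -n 3 "$workdir/file1.txt")    # lines 1-3
last2=$(tail -n 2 "$workdir/file1.txt")     # lines 19-20

# Lines 5-7: take the first 7 lines, then keep the last 3 of those.
middle=$(head -n 7 "$workdir/file1.txt" | tail -n 3)

printf '%s\n' "$first3"
printf '%s\n' "$last2"
printf '%s\n' "$middle"
rm -r "$workdir"
```

Chaining head and tail through a pipe is a simple way to cut an arbitrary line range out of a file.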
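3. Extracting text by keyword with grep
grep prints the lines of a file, or of standard input, that match a pattern.
$ grep 'root' /etc/passwd
Common options:
-i: ignore case when matching
-n: print the line number of each matching line
-o: print only the matched text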
[root@localhost tmp]# grep 'root' /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
[root@localhost tmp]#
[root@localhost tmp]# grep -i root /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
Root
[root@localhost tmp]#
[root@localhost tmp]# grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
10:operator:x:11:0:operator:/root:/sbin/nologin
[root@localhost tmp]# grep -o root /etc/passwd
root
root
root
root
[root@localhost tmp]#
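The options above can also be combined. A minimal sketch, assuming a small passwd-style sample file (passwd.sample is a hypothetical file created for this illustration, so the real /etc/passwd is not touched):

```shell
workdir=$(mktemp -d)
sample="$workdir/passwd.sample"
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
Root
EOF

# -i and -n together: case-insensitive matches, each with its line number.
matches=$(grep -in 'root' "$sample")
printf '%s\n' "$matches"

# -o prints every match on its own line, so counting lines counts occurrences.
occurrences=$(grep -o 'root' "$sample" | wc -l)
printf 'occurrences of "root": %d\n' "$occurrences"
rm -r "$workdir"
```

Because -o emits one line per match, piping it into wc -l counts occurrences rather than matching lines.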
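4. Extracting columns or fields with cut
cut prints the selected columns of a file or of standard input.
$ cut -d: -f1 /etc/passwd
$ grep root /etc/passwd | cut -d: -f7
Use the -d option to set the column delimiter.
Use the -f option to choose which columns to print.
[root@localhost tmp]# cut -d: -f1 /etc/passwd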
root
bin
daemon
adm
lp
sync
shutdown
halt
operator
games
ftp
nobody
dbus
systemd-coredump
systemd-resolve
tss
polkitd
geoclue
[root@localhost tmp]# grep root /etc/passwd | cut -d: -f7
/bin/bash
/sbin/nologin
[root@localhost tmp]#
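The -f option also accepts a list or a range of fields. A minimal sketch, assuming a two-line passwd-style sample (passwd.sample is a hypothetical file used only for illustration):

```shell
workdir=$(mktemp -d)
sample="$workdir/passwd.sample"
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
EOF

# A comma-separated list: field 1 (user name) and field 7 (login shell).
name_and_shell=$(cut -d: -f1,7 "$sample")
printf '%s\n' "$name_and_shell"

# A range: fields 1 through 3.
first_three=$(cut -d: -f1-3 "$sample")
printf '%s\n' "$first_three"
rm -r "$workdir"
```

When several fields are selected, cut joins them in the output with the same delimiter that was given to -d.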
1. Counting text with wc
wc counts words, lines, bytes, and characters.
wc -l counts lines, wc -w counts words, and wc -c counts bytes.
[root@localhost tmp]# wc -l file1.txt
21 file1.txt
[root@localhost tmp]# wc -c file1.txt
686 file1.txt
[root@localhost tmp]# wc -w file1.txt
101 file1.txt
[root@localhost tmp]# wc file1.txt
 21 101 686 file1.txt
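In scripts it is often handy to get only the number, without the file name. A minimal sketch, assuming a tiny sample file (words.txt is a hypothetical name): when wc reads from standard input it has no file name to print.

```shell
workdir=$(mktemp -d)
printf 'one two\nthree\n' > "$workdir/words.txt"

# With a file argument, wc appends the file name to the count.
wc -l "$workdir/words.txt"

# Reading from standard input leaves only the number, which is easier to reuse.
lines=$(wc -l < "$workdir/words.txt")
words=$(wc -w < "$workdir/words.txt")
bytes=$(wc -c < "$workdir/words.txt")
printf 'lines=%d words=%d bytes=%d\n' "$lines" "$words" "$bytes"
rm -r "$workdir"
```

The sample holds 2 lines, 3 words, and 14 bytes (8 for "one two" plus newline, 6 for "three" plus newline).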
2. Sorting text with sort
sort writes the sorted lines to standard output; the original file is not changed.
sort -r sorts in reverse order.
[root@localhost tmp]# cat 1.txt
d
g
9
4
6
2
[root@localhost tmp]# sort 1.txt
2
4
5
6
9
d
f
g
[root@localhost tmp]# sort -r 1.txt
g
f
d
9
6
5
4
2
[root@localhost tmp]#
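By default sort compares lines character by character, which can surprise with multi-digit numbers. A minimal sketch, assuming a hypothetical nums.txt; the -n (numeric) option shown here is not covered in the text above and is included only as an illustration:

```shell
workdir=$(mktemp -d)
printf '10\n9\n2\n' > "$workdir/nums.txt"

# Default comparison is character by character, so "10" sorts before "2".
# LC_ALL=C pins the collation order so the result does not depend on the locale.
lexical=$(LC_ALL=C sort "$workdir/nums.txt")

# -n compares the lines as numbers instead.
numeric=$(sort -n "$workdir/nums.txt")

# -n and -r can be combined for descending numeric order.
descending=$(sort -nr "$workdir/nums.txt")

# The input file itself is left untouched.
original=$(cat "$workdir/nums.txt")

printf '%s\n' "$lexical"
printf '%s\n' "$numeric"
printf '%s\n' "$descending"
rm -r "$workdir"
```

To keep a sorted copy, redirect the output to a new file rather than to the file being read.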
3. Removing duplicates with uniq
uniq: removes duplicate lines that are adjacent to each other.
[root@localhost tmp]# cat 1.txt
1111111111111111
1111111111111111
222222222222222
222222222222222
[root@localhost tmp]# uniq 1.txt
1111111111111111
222222222222222
[root@localhost tmp]#
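Because uniq only looks at neighbouring lines, duplicates that are not adjacent survive. A minimal sketch, assuming a hypothetical dup.txt, of the common sort-then-uniq combination:

```shell
workdir=$(mktemp -d)
printf 'b\na\nb\na\na\n' > "$workdir/dup.txt"

# uniq alone only merges neighbouring duplicates: here, the two trailing "a" lines.
adjacent_only=$(uniq "$workdir/dup.txt")

# Sorting first makes all equal lines adjacent, so every duplicate is removed.
fully_unique=$(sort "$workdir/dup.txt" | uniq)

# -c prefixes each line with the number of times it occurred.
counted=$(sort "$workdir/dup.txt" | uniq -c)

printf '%s\n' "$adjacent_only"
printf '%s\n' "$fully_unique"
printf '%s\n' "$counted"
rm -r "$workdir"
```

The first result still contains four lines (b, a, b, a), while the sorted pipeline leaves only a and b.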
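4. Comparing text with diff
The diff command compares the contents of files, and is especially useful for finding what changed between two versions of a file. diff prints every changed line on the command line; the comparison is based on file contents and has nothing to do with file names.
"<" and ">" mark lines from the first and the second file given to diff, respectively.
"1,4c1,2" means that lines 1 to 4 of the first file were changed (a = added, d = deleted, c = changed) and correspond to lines 1 to 2 of the second file.
"---" is the separator: the lines above it come from the first file, the lines below it from the second.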
[root@localhost tmp]# diff 1.txt 2.txt
1,4c1,2
< 1111111111111111
< 1111111111111111
< 222222222222222
< 222222222222222
---
> 11111111111111
> 11111111111111
[root@localhost tmp]#
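To see the c (change) and a (add) markers side by side, here is a minimal sketch, assuming two hypothetical files old.txt and new.txt. It also captures the exit status of diff, which is 0 when the files are identical and 1 when they differ:

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$workdir/old.txt"
printf 'alpha\nBETA\ngamma\ndelta\n' > "$workdir/new.txt"

# Capture the report and the exit status (0 = identical, 1 = different).
report=$(diff "$workdir/old.txt" "$workdir/new.txt")
status=$?

printf '%s\n' "$report"
printf 'exit status: %d\n' "$status"
rm -r "$workdir"
```

With these two files the report should contain a "2c2" hunk (beta became BETA) and a "3a4" hunk (delta was added after line 3). The exit status makes diff usable in shell conditionals.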
1. The tr text manipulation tool
tr translates characters, mapping one set of characters onto another. It reads data only from STDIN.
[root@localhost tmp]# tr '1-9' 'A-Z' < 1.txt
AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBB
BBBBBBBBBBBBBBB
[root@localhost tmp]#
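Since tr takes no file argument, its input has to arrive through a pipe or a redirection. A minimal sketch with inline sample strings; the -d (delete) option is not covered in the text above and is shown only as an illustration:

```shell
# Translate lowercase letters to uppercase, feeding tr through a pipe.
upper=$(printf 'Red Hat\n' | tr 'a-z' 'A-Z')
printf '%s\n' "$upper"          # RED HAT

# -d deletes every character in the set instead of translating it.
no_digits=$(printf 'file123.txt\n' | tr -d '0-9')
printf '%s\n' "$no_digits"      # file.txt
```

tr works on single characters only; replacing whole words or patterns is a job for sed.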
2. The sed text manipulation tool
sed is a stream editor, most often used to replace text.
For example, to replace the digit 1 with the letter a:
[root@localhost tmp]# cat 1.txt
1111111111111111
1111111111111111
222222222222222
222222222222222
[root@localhost tmp]# sed "s/1/a/g" 1.txt
aaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaa
222222222222222
222222222222222
[root@localhost tmp]#
Use sed -i to replace the text and save the change to the file:
[root@localhost tmp]# sed -i "s/1/a/g" 1.txt
[root@localhost tmp]# cat 1.txt
aaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaa
222222222222222
222222222222222
[root@localhost tmp]#
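Editing in place overwrites the file, so keeping a backup is a sensible precaution. A minimal sketch, assuming a hypothetical copy of 1.txt in a temporary directory, that attaches a suffix to -i so sed saves the original first:

```shell
workdir=$(mktemp -d)
printf '1111\n2222\n' > "$workdir/1.txt"

# -i.bak edits the file in place and keeps the original as 1.txt.bak.
sed -i.bak 's/1/a/g' "$workdir/1.txt"

edited=$(cat "$workdir/1.txt")
backup=$(cat "$workdir/1.txt.bak")

printf 'edited:\n%s\n' "$edited"
printf 'backup:\n%s\n' "$backup"
rm -r "$workdir"
```

Running the same expression without -i first is a cheap way to preview the result before committing to the change.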
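Examples:
sed -n 5p passwd  # print line 5
sed -n '$p' passwd  # print the last line (quote $p so the shell does not expand it)
sed -n '1,5p' passwd  # print lines 1-5
sed -n '/root/Ip' passwd  # print lines containing root, ignoring case
sed -n '\%root%Ip' passwd  # same as above, with % as the delimiter
sed '1,5d' passwd  # delete lines 1-5 from the output
sed -i.bak '1,5d' passwd  # delete lines 1-5 in place; the original is kept as passwd.bak
sed '2a \abc' passwd  # append abc below line 2
sed 's/north/hello/' datafile  # replace the first north on each line
sed 's/north/hello/g' datafile  # replace every north
sed '1 s/north/hello/g' datafile  # replace every north on line 1
sed '1 s/north/hello/' datafile  # replace the first north on line 1
sed '1 s/north/hello/2' datafile  # replace only the second north on line 1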
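The difference between the substitution variants is easiest to see on a concrete file. A minimal sketch, assuming a hypothetical two-line datafile created in a temporary directory:

```shell
workdir=$(mktemp -d)
printf 'north north north\nnorth north\n' > "$workdir/datafile"

# No flag: only the first match on each line is replaced.
first_only=$(sed 's/north/hello/' "$workdir/datafile")

# g flag: every match on every line is replaced.
every=$(sed 's/north/hello/g' "$workdir/datafile")

# Address 1 plus numeric flag 2: only the second match on line 1 is replaced.
second_on_line1=$(sed '1 s/north/hello/2' "$workdir/datafile")

printf '%s\n' "$first_only"
printf '%s\n' "$every"
printf '%s\n' "$second_on_line1"
rm -r "$workdir"
```

None of these commands uses -i, so the datafile itself stays unchanged and only the printed output differs.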
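Using substitution to delete content (rather than deleting whole lines):
sed 's/north//' datafile  # delete the first north on every line
sed 's/north//g' datafile  # delete every north
sed '1 s/north//2' datafile  # delete the second north on line 1
sed 's/^/#/' datafile  # comment out every line; ^ matches the start of the line
sed 's/^.//' datafile  # delete the first character of every line
sed 's/^\(..\)./\1/' datafile  # delete the third character of every line; \1 is the text matched by the first group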
sed 's/^\