Knowledge

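Linux

Concepts

File system

Hard links / soft links

Hard link: points directly at the inode (so deleting the original file name does not invalidate the link)

Soft link: stores a path to the target (so the link breaks if the target is no longer at that path)

inode (index node): A Linux file system ties each file name to an inode. To read a file, the file system looks up the name in the current directory table to get the corresponding inode number, then uses the disk address table in that inode to assemble the scattered physical blocks into the file's logical structure.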
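
The difference between the two link types can be checked from Python's `os` module. This is a minimal sketch, assuming a Linux file system that supports both link types; the function name `link_demo` and the file names are made up for illustration.

```python
import os
import tempfile


def link_demo():
    """Create a hard link and a soft link, delete the original, report what survives."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "source.log")
        hard = os.path.join(d, "hard.log")
        soft = os.path.join(d, "soft.log")
        with open(src, "w") as f:
            f.write("hello")

        os.link(src, hard)     # hard link: a second name for the same inode
        os.symlink(src, soft)  # soft link: a small file that stores the path to src

        same_inode = os.stat(src).st_ino == os.stat(hard).st_ino
        link_count = os.stat(src).st_nlink  # number of names pointing at the inode

        os.remove(src)  # removes one name; the inode survives while hard.log exists

        with open(hard) as f:
            hard_content = f.read()
        soft_alive = os.path.exists(soft)    # follows the link; False when dangling
        soft_is_link = os.path.islink(soft)  # the link file itself is still present

        return same_inode, link_count, hard_content, soft_alive, soft_is_link
```

After the original name is removed, the data is still readable through the hard link, while the soft link is left dangling.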

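Security

CC attack: many different users keep sending requests nonstop

DDoS attack (distributed denial of service): many computers act together and tie up large amounts of network resources with a flood of legitimate-looking requests

<strike>nginx_waf against SQL injection: the application has a security flaw. A user can submit a piece of database query code and, from the results the program returns, obtain data they want to know. </strike>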

shell

Common commands

File management

File handling

cp

mv

rm

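Search

find: find pathname -options [-print -exec -ok ...]

locate: like find, but searches quickly through a prebuilt file-name database; updatedb refreshes the database

which: show the location of an executable (searches PATH)

whereis: locate the binary, source, and manual page files for a command

Viewing file contents

head

tail: shows the end of the given file. With no file given, it reads standard input. Commonly used to inspect log files.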

more

less

grep: grep [option] pattern file|dir

wc

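sed: stream editor for editing file text

ln: create a link: ln -sv source.log link.log

chmod: file type / owner / group / others: _/rwx/rwx/rwx

chown: chown <owner>:<group> <file>

Disk commands

cd

df: disk usage per file system

du: space used by files and directories

ls

mkdir: create a directory; -p creates the whole path. rmdir: remove an empty directory

pwd: current working directory

Read user input: read <variable>

User / group management

Add a group: groupadd <group>

Add a user: useradd -g <group> <user>

Delete a user: userdel -r <user>

Network commands

ifconfig: show network interfaces (-a for all). Use up and down to start or stop an interface: ifconfig eth0 up and ifconfig eth0 down

netstat: show network status: netstat [-acCeFghilMnNoprstuvVwx][-A<network type>][--ip]

ping: check whether a host is reachable

telnet: remote login: telnet 192.168.0.5

System administration commands

date: date and time

free: memory usage

kill: sends the specified signal to a process. With no signal specified, it sends SIGTERM (15) to terminate the process.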
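
The default `kill` behaviour can be reproduced with `os.kill`. This is a minimal sketch, assuming a POSIX system; the helper name `terminate_child` is made up for illustration, and the child is just a Python process that sleeps.

```python
import os
import signal
import subprocess
import sys


def terminate_child():
    """Start a sleeping child, send it SIGTERM, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # Same effect as running `kill <pid>` with no signal given
    os.kill(child.pid, signal.SIGTERM)
    # subprocess reports death by signal N as the return code -N
    return child.wait(timeout=10)
```

A process killed by a signal has no ordinary exit code, which is why `subprocess` reports the negated signal number.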

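ps: process status

pstree: process tree

top: system status, including process ID, memory usage, CPU usage, etc.

Scheduled execution

crontab

Compression/decompression, archiving/extraction

Compress / decompress

bzip2: create a *.bz2 file: bzip2 test.txt. Decompress a *.bz2 file: bzip2 -d test.txt.bz2

gzip: create a *.gz file: gzip test.txt. Decompress a *.gz file: gzip -d test.txt.gz. Show the compression ratio: gzip -l *.gz

unzip

Archive / extract

tar

Programs / programming

ulimit -a: show resource limits such as the data segment size

Parameters / variables

<ul><li>$0: name of the script itself</li><li>$1 to $n: the positional arguments</li><li>$*: all arguments as one list when quoted ("1 2 3")</li><li>$@: all arguments as separate words when quoted ("1" "2" "3")</li><li>$#: number of arguments</li><li>$$: PID of the running script</li><li>$!: PID of the most recent background process started by the shell</li><li>$?: exit status of the last command</li></ul>

Unset a variable: unset <name>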

Lists / arrays

Using seq: $(seq 1 30)

Space-separated items: A B C

With parentheses: a=(Shirley 2 Hank). Iterate with ${a[*]}; access an index: ${a[1]}; assign to an index: a[1]=loves; get the length: ${#a[*]}

${} operations

Arithmetic: <ul><li>$[16 + 4]</li><li>expr 16 + 4</li></ul>

Statements

Comparison operators:  <ul><li>==</li><li>-gt</li><li>-lt</li></ul>

if statement: if [ $x -gt $y ]; then command; else if [ condition ]; then command; else command; fi; fi

Loops

for loop: for variable in list; do command; done

Alternatively: for((i=1;i<51;i++))

while loop: while [ condition ]; do command; done

Functions

gcc

-E: preprocess only (macros, header files, etc.)

-c: compile and assemble without linking, producing a .o file

System calls

Processes

fork()

Shared Memory(POSIX)

int shm_open(const char *name, int oflag, mode_t mode) // open a shared memory

int ftruncate(int fd, off_t length) // Configure the size

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset) // mmap. Here we set flags as MAP_SHARED

int shm_unlink(const char *name) // unlink and remove
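
The same create, size, map, unlink lifecycle can be tried from Python through `multiprocessing.shared_memory`, which on POSIX systems is built on these calls. This is a minimal sketch, assuming Python 3.8 or later; `shm_roundtrip` is a made-up helper, and the second handle is opened in the same process only to stand in for another process attaching by name.

```python
from multiprocessing import shared_memory


def shm_roundtrip(payload: bytes) -> bytes:
    """Write payload into a new shared memory object and read it back by name."""
    # create=True plays the role of shm_open with O_CREAT plus ftruncate to `size`
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        owner.buf[:len(payload)] = payload  # write through the shared mapping
        # Attach a second handle by name, as another process would
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            data = bytes(peer.buf[:len(payload)])
        finally:
            peer.close()  # drop this handle's mapping
    finally:
        owner.close()
        owner.unlink()  # remove the object, like shm_unlink
    return data
```

The payload must be non-empty, because a shared memory object of size zero cannot be created this way.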

Pipes

Anonymous Pipe

int pipe(int pipefd[2]) // pipefd[0] is the read end // pipefd[1] is the write end // This pipe is anonymous, // using fork() to share fds
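
Python exposes the same calls as `os.pipe` and `os.fork`, so the fd-sharing pattern can be sketched briefly. This assumes a POSIX system; `pipe_roundtrip` is a made-up helper name.

```python
import os


def pipe_roundtrip(message: bytes) -> bytes:
    """Send message from a forked child to the parent through an anonymous pipe."""
    read_fd, write_fd = os.pipe()  # (read end, write end), like pipefd[0] and pipefd[1]
    pid = os.fork()
    if pid == 0:
        # Child: inherits both fds, keeps only the write end
        os.close(read_fd)
        os.write(write_fd, message)
        os.close(write_fd)
        os._exit(0)
    # Parent: keeps only the read end
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # an empty read means every write end has been closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child
    return b"".join(chunks)
```

Closing the unused end in each process matters: the parent only sees end-of-file once no process holds the write end open.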

Named Pipe

int mkfifo(const char *pathname, mode_t mode) // Named pipe

int mkfifoat(int dirfd, const char *pathname, mode_t mode) // Like mkfifo, but a relative pathname is resolved relative to dirfd

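Memory management

brk() // used by malloc() // grows and shrinks the memory area

Process scheduling

CFS: tracks a virtual runtime (vruntime) for each task, scaled by a priority-based decay factor. The scheduler picks the task with the smallest vruntime to run next. Tasks are kept in a red-black tree.

Nice value: -20 to +19

Target latency: the interval within which every runnable task should get to run once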
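
The pick-the-smallest-vruntime rule can be simulated in a few lines. This is a rough sketch under stated assumptions, not the kernel's implementation: a binary heap stands in for the red-black tree, every turn is a fixed slice rather than one derived from the target latency, and `nice_to_weight` uses the approximation that each nice step changes the weight by a factor of about 1.25, with nice 0 at 1024. All names are made up for illustration.

```python
import heapq


def nice_to_weight(nice: int) -> float:
    # Assumption: roughly 1.25x per nice step, nice 0 maps to 1024
    return 1024 / (1.25 ** nice)


def simulate_cfs(nices: dict, ticks: int, slice_ms: float = 1.0) -> dict:
    """Return the real runtime (ms) each task received after `ticks` decisions."""
    # Heap entries are (vruntime, name); the smallest vruntime is always on top
    queue = [(0.0, name) for name in sorted(nices)]
    heapq.heapify(queue)
    runtime = {name: 0.0 for name in nices}
    for _ in range(ticks):
        vruntime, name = heapq.heappop(queue)  # task with the smallest vruntime
        runtime[name] += slice_ms
        # vruntime advances more slowly for heavier (higher-priority) tasks
        vruntime += slice_ms * 1024 / nice_to_weight(nices[name])
        heapq.heappush(queue, (vruntime, name))
    return runtime
```

For example, `simulate_cfs({"high": -5, "normal": 0, "low": 5}, 3000)` gives the task with the lowest nice value the largest share of the 3000 slices.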

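Parallel programming

OpenMP

Basic operations

Parallel blocks

#pragma omp parallel // run the following block in parallel // By default, there is a barrier at the end.

Clauses

#pragma omp parallel num_threads(N) // run the following block in parallel with N threads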

#pragma omp parallel shared(A,B,C) private(id) // specify the shared and private variables

Synchronization

#pragma omp barrier // Wait until all threads reach this point

#pragma omp critical // The following block can be executed by only one thread at a time

#pragma omp atomic // Like critical, but for simple updates (such as increments); // uses special hardware instructions if available

Parallel for loops

#pragma omp for

Clauses

#pragma omp for reduction(op:list) // Each variable in list gets a thread-local copy // initialized to the identity of op; // the local copies are combined into the original variable at the end

#pragma omp for nowait // Normally threads wait at the end of the loop // until all threads are done. // If waiting is unnecessary, use nowait.

work sharing

master construct

#pragma omp master // Only the master thread executes the code in this construct.

single construct

#pragma omp single // Only one thread (the first to arrive) // executes the code

Clauses

#pragma omp single nowait

section construct

#pragma omp sections {     #pragma omp section         do_x();      #pragma omp section          do_y(); }

Locks: omp_lock_t

omp_init_lock()

omp_set_lock()

omp_unset_lock()

omp_destroy_lock()

omp_test_lock()

Runtime Library

threads numbers

omp_set_num_threads()

omp_get_num_threads() // Returns 1 outside a parallel region; // the team size is only available inside one

omp_get_thread_num()

omp_get_max_threads()

omp_in_parallel() // To check if running in parallel region

dynamic mode: the number of threads actually provided may differ from the number requested

omp_set_dynamic(int _Dynamic_threads)

omp_get_dynamic()

omp_get_num_procs() // Get how many processors are available

Environment Variables

OMP_NUM_THREADS: default number of threads to use

OMP_STACKSIZE: stack size needed by the threads

OMP_WAIT_POLICY ACTIVE | PASSIVE <ul><li>ACTIVE: waiting threads spin until work is available (costs almost nothing to resume)</li><li>PASSIVE: waiting threads are put to sleep (costs a lot to resume)</li></ul>

OMP_PROC_BIND TRUE | FALSE: once a thread is bound to a processor, it stays there

Data Environment

private // The private copy is not initialized

firstprivate // The private copy is initialized with the value of the original variable

lastprivate // The value from the last iteration is copied back to the original variable

shared

Tasks

#pragma omp task // Create a task to execute following codes

#pragma omp taskwait

Memory Model

Weak Consistency S>>W, S>>R, R>>S, W>>S, S>>S

#pragma omp flush(list) // Reads and writes that overlap the flush set and come after the flush // are not executed until the flush completes.

implicit flushes <ol><li>entry/exit of parallel region</li><li>implicit/explicit barrier</li><li>entry/exit of critical section</li><li>a lock is set/unset</li></ol>

#pragma omp atomic [read|write|update|capture]

Thread Private

#pragma omp threadprivate(xxx) // Each thread has its own copy of xxx // (identified by thread ID). // The copy is global, which means it stays // alive across all parallel blocks

copyin(xxx) // copy the main thread's xxx into each thread's copy

Books

Using OpenMP

Patterns For Parallel Programming by Tim Mattson

Concurrency in Programming Languages (About the history)

The Art of Concurrency

pthread (POSIX standard)

pthread_create()

pthread_exit() // can supply a return value

pthread_join()

pthread_kill()

Thread handle type: pthread_t

pthread_cancel() pthread_testcancel()

Patterns

SIMD(Single Instruction Multiple Data)

for parallel

Divide and Conquer Pattern: Split the problem and solve the subproblems, applying the same step recursively.
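
Merge sort is a common instance of this pattern, chosen here only as an illustration. This is a sequential sketch; the two recursive calls are independent, which is what makes the pattern a candidate for parallel execution.

```python
def merge_sort(items):
    """Sort a sequence by splitting it, sorting each half, and merging the results."""
    if len(items) <= 1:  # base case: nothing left to split
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])   # solve the first subproblem
    right = merge_sort(items[mid:])  # solve the second, independent of the first
    # Combine step: merge two sorted lists into one
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```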

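Programming languages

Python

re

re.match(pattern, string, flags=0) # try to match starting at the beginning of the string

.group(num) .groups() .span(num) # span returns the (start, end) indices of the match

re.search(pattern, string, flags=0) # scan the string and return the first successful match

re.sub(pattern, repl, string, count=0, flags=0) # substitution

re.findall(pattern, string, flags=0) # return a list of all matching substrings

re.finditer(pattern, string, flags=0) # like findall, but returns an iterator of match objects

re.split(pattern, string, maxsplit=0, flags=0) # split the string at every match and return the pieces as a list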
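
The calls above can be compared side by side on one string. The sample text and patterns below are made up for illustration.

```python
import re

text = "2020-02-25 alpha7; 2019-11-03 beta"
date = r"(\d{4})-(\d{2})-(\d{2})"

m = re.match(date, text)               # anchored at the start of the string
first = (m.group(0), m.groups(), m.span(1))

no_match = re.match(r"alpha", text)    # None: "alpha" is not at position 0
found = re.search(r"alpha\d", text)    # scans forward to the first match

all_dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)                 # list of strings
starts = [hit.start() for hit in re.finditer(r"\d{4}", text)]      # match objects
masked = re.sub(r"\d", "#", text, count=4)                         # first 4 digits only
parts = re.split(r";\s*", text)
```

`re.match` returns `None` for a pattern that only occurs later in the string, which is the most common reason to reach for `re.search` instead.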

Consistency (about R (read), W (write), S (sync))

Sequential Consistency: Program Order = Code Order = Commit Order

Relaxed Consistency

Algorithm

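Association

Apriori family

FP Tree

Summary of the principles of the FP Tree algorithm

Implementing Machine Learning Algorithms from Scratch (14): FP-growth

Nearest-neighbor algorithms

KNN

CNN

Reinforcement learning

Markov decision process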

Q-Learning

SARSA
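
The two methods differ only in the value they bootstrap from, which a side-by-side sketch makes visible. These are the standard tabular update rules; the function names, the dictionary-based Q table, and the default `alpha` and `gamma` values are choices made for this illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Off-policy update: bootstrap from the best action available in s_next."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Usage: a Q table keyed by (state, action), with unseen entries defaulting to 0
Q = defaultdict(float)
Q[("s1", "left")] = 1.0
Q[("s1", "right")] = 2.0
q_learning_update(Q, "s0", "right", 1.0, "s1", ["left", "right"])
sarsa_update(Q, "s0", "left", 1.0, "s1", "left")
```

With the same reward and next state, Q-Learning uses the larger value of `right` in `s1`, while SARSA uses the value of whichever action the agent actually follows.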

Consistent Hashing
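
A hash ring with virtual nodes is one common way to implement the idea. This is a minimal sketch: the class name `HashRing`, the replica count, and the use of SHA-256 for ring positions are all choices made for this illustration.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node, to even out the load
        self._keys = []           # sorted ring positions
        self._owner = {}          # ring position -> node
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def add(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if pos in self._owner:  # skip the (unlikely) position collision
                continue
            self._owner[pos] = node
            bisect.insort(self._keys, pos)

    def remove(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._keys.remove(pos)

    def lookup(self, key: str):
        if not self._keys:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key's hash, wrapping around
        idx = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[idx]]
```

When a node is removed, only the keys that node owned move to other nodes; every other key keeps its owner.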

K-V Store

Data Structures

Log-Structured Merge Tree (LSM-tree)

Advantages: high write efficiency; low cost as tiered storage

Disadvantages: slow compactions (heavy I/O)

background compaction operations (a.k.a. merge)
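
A toy in-memory model shows where the cheap writes and the expensive compactions come from. This is a sketch under heavy simplification: the "SSTables" are sorted Python lists rather than files on disk, there is a single level, and `TinyLSM` and its method names are made up for illustration.

```python
class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}   # recent writes, held in memory
        self.sstables = []   # newest first; each is a sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value  # a write only touches the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a new sorted, immutable table."""
        if self.memtable:
            self.sstables.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in self.sstables:  # newest to oldest, so fresh values win
            for k, v in table:       # linear scan, kept simple for brevity
                if k == key:
                    return v
        return None

    def compact(self):
        """Merge all tables into one, keeping only the newest value per key."""
        merged = {}
        for table in reversed(self.sstables):  # oldest first, newer overwrite
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []
```

Reads may have to check several tables until a compaction rewrites them into one, and that rewrite is the I/O-heavy step.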

Paper

fast20: FPGA-Accelerated Compactions for LSM-based Key-Value Store

Data center

Fat Tree

All Reduce

Parameter Server: a dedicated server performs the aggregation

Collective AllReduce

Halving-Doubling

ring

double binary tree

Hierarchical Aggregation

Given how data centers are structured, usually two levels

Math

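Probability theory and mathematical statistics

Probability theory

Common distributions

Discrete

Binomial distribution

Poisson distribution

Uniform distribution

Continuous

Normal distribution

Exponential distribution

Chi-squared distribution

t-distribution

Numerical characteristics

Mean

Variance

Covariance / covariance matrix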