zawk = AWK + stdlib + Rust

Install zawk

$ cargo install zawk
$ brew install linux-china/tap/zawk
$ sudo xattr -r -d com.apple.quarantine $(readlink -f $(brew --prefix zawk))/bin/zawk

How to run zawk

$ zawk 'BEGIN { print strftime(), whoami() }'
$ zawk -f demo.awk demo.txt

String functions: utf-8 by default

length(s)
Length of s (UTF-8 aware)
char_at($1, 1)
Get the character at the given index; indexes start at 1
match(s, re)
Checks whether string s matches the regular expression re
substr(s, i[, j])
1-indexed substring of string s
sub(re, t, s)
Substitutes t for the first occurrence of regular expression re in string s
gsub(re, t, s)
Like sub, but with all occurrences substituted, not just the first
index(s, t)
Position in string s where string t first occurs, 0 if not found
last_index(s, t)
Last position in string s where string t occurs, 0 if not found
split(s, m[, fs])
Splits the string s according to fs, placing the results in the array m
sprintf(fmt, s, ...)
Returns a string formatted according to fmt and the provided arguments
printf(fmt, s, ...) [>[>] out]
Like sprintf, but the result is written to standard output, or to out when redirected
hex(s)
Returns the hexadecimal integer (e.g. 0x123abc) encoded in s
join_fields(i, j[, sep])
Returns columns i through j (1-indexed, inclusive) concatenated, joined by sep
join_csv(i, j)
Like join_fields but with columns joined by commas
join_tsv(i, j)
Like join_fields but with columns joined by tabs
tolower(s)
Returns a copy of s with all uppercase ASCII characters converted to lowercase
toupper(s)
Returns a copy of s with all lowercase ASCII characters converted to uppercase
strtonum(s)
Numeric (decimal) value of s, e.g. strtonum("0x11")
trim(s), trim(s, "[]()")
Trim s: whitespace by default, or the given characters
truncate(s,10)
Truncate s to a fixed width
capitalize(s)
Capitalize first character of s
uncapitalize(s)
Uncapitalize first character of s
camel_case(s)
Return camel case of s: helloWorld
kebab_case(s)
Return kebab case of s: hello-world
snake_case(s)
Return snake case of s: hello_world
title_case(s)
Return title case of s: Hello World
isint(s)
Whether s is a 64-bit integer
isnum(s)
Whether s is a number (int or float)
starts_with(s, prefix)
Check whether s starts with prefix
ends_with(s, suffix)
Check whether s ends with suffix
contains(s, child)
Check whether s contains child
mask(s)
Mask sensitive text, such as an email address or phone number
pad(s, 10, "*")
Pad s to the given width with a placeholder character
strcmp(s1, s2)
Compare two strings
words(s)
Split text into an array of words
repeat("*", 3)
Repeat text the given number of times
default_if_empty(s, "0")
Return the default value if the text is empty or does not exist
append_if_missing(s, "/")
Add suffix if missing
preappend_if_missing(s, "https://")
Add prefix if missing
remove_if_begin("./demo.json", "./")
Remove prefix if present
remove_if_end("demo.json", ".json")
Remove suffix if present
quote(s)
Quote text if not quoted
double_quote(s)
Double quote text if not quoted
format_bytes(size)
Format a byte size: 10.1 MB
to_bytes("10.2 MB")
Convert a formatted size to a number of bytes
escape(format, s)
Escape s with format: json, csv, tsv, xml, sql, shell
escape_csv(s)
Escape s for CSV
last_part(text, sep)
Get the last part after splitting by separator
parse(text, template)
Parse text with a wildcard template
rparse(text, regex)
Parse text with regex match groups
- parse: use wild match -
parse("Hello World","{greet} {name}")["greet"]

- rparse: use regex group -
rparse("Hello World","(\\w+) (\\w+)")[1]
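
The classic AWK string built-ins listed above (index, substr, sub, gsub, split, sprintf) can be tried in a few lines. This is a minimal sketch, not taken from the zawk docs: it calls plain POSIX awk so it runs without zawk installed, and it assumes zawk keeps the same semantics for these standard functions. The function name string_demo is made up for the example.

```shell
# Assumption: plain POSIX awk; swap "awk" for "zawk" to compare behaviour.
string_demo() {
  awk 'BEGIN {
    s = "hello world, hello awk"
    print index(s, "world")              # 1-based position of the first match
    print substr(s, 7, 5)                # 5 characters starting at position 7
    t = s; sub(/hello/, "bye", t)        # replaces only the first match
    print t
    u = s; n = gsub(/hello/, "bye", u)   # replaces every match, returns the count
    print n, u
    k = split("a:b:c", parts, ":")       # fills parts[1..k], returns k
    print k, parts[1], parts[3]
    print sprintf("%05.1f|%-4s|", 3.14159, "ab")
  }'
}
string_demo
```

Note that sub and gsub modify their third argument in place, which is why the sketch copies s into t and u first.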

Text Parser

url("https://example.com/path")
Parse s to URL array: schema, user, host, port ...
data_url("data:text/plain;base64,xxxx")
Parse data URL: data_url("data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==")
shlex("ls -al")
Parse command line to array
path("./demo.txt")
Parse path
semver("1.2.3-alpha")
Parse semantic version
pairs("id=1&name=Hello%20Wo")
Parse URL query or plain pairs such as "a=b,c=d"
record("requests_total{code=\"200\"}")
Parse a named record with pairs, such as a Prometheus metric
message("name{a=1}(body)")
Parse message: name, headers and body
func("hello(1,2,3)")
Parse function call
flags("{vip,top20}")
Parse flags
variant("week(5)")
Parse variant
tuple("('first',1)")
Parse tuple
parse_array("['first','second']")
Parse array

I/O functions

read_all(file_path)
Read the whole file as text
write_all(file_path, text)
Write text to a file
getline()
Read line from file
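
Reading a file line by line with getline usually follows the loop below. This is a sketch in plain POSIX awk, under the assumption that zawk accepts the same `getline var < file` form; the helper name count_lines and the temp file are invented for the demo.

```shell
# Assumption: plain POSIX awk; the input is a throwaway temp file.
count_lines() {
  awk -v file="$1" 'BEGIN {
    # getline returns 1 for each line read, 0 at end of file, -1 on error
    while ((getline line < file) > 0) {
      n++
      last = line
    }
    close(file)
    print n, last
  }'
}
tmp="$(mktemp)"
printf 'alpha\nbeta\ngamma\n' > "$tmp"
count_lines "$tmp"
rm -f "$tmp"
```

Comparing the return value with `> 0` matters: a bare `while (getline line < file)` loops forever when the file cannot be opened, because -1 is truthy.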

OS functions

system(cmd)
Execute cmd and return exit status
whoami()
Current user name
os()
OS name
arch()
CPU architecture, such as x86_64, aarch64 ...
os_family()
OS family: unix or windows
pwd()
Current working directory
user_home()
User home directory
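
A short sketch of system(cmd) returning the exit status of the command it runs. It uses plain POSIX awk and assumes zawk reports the status the same way; exit_status_demo is a made-up name.

```shell
# Assumption: plain POSIX awk; swap "awk" for "zawk" to compare behaviour.
exit_status_demo() {
  awk 'BEGIN {
    rc = system("exit 3")   # the shell exits with status 3
    print "status:", rc
    ok = system("true")     # a successful command yields 0
    print "ok:", ok
  }'
}
exit_status_demo
```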

Database functions

sqlite_query("sqlite.db", sql)
SQLite query: return value is array of CSV lines
sqlite_execute("sqlite.db", sql)
SQLite execute
mysql_query(url, sql)
MySQL query: return value is array of CSV lines
mysql_execute(url, sql)
MySQL execute
MySQL url: mysql://root:123456@localhost:3306/test

Misc commands

# dump prometheus text to CSV
$ zawk dump --prometheus http://localhost:8080/actuator/prometheus

# parse CSV
$ zawk -f demo.awk -i csv demo.csv

# Nushell
$ ls | to csv | ^zawk -i csv '{print $1}'

ID generator

uuid()/uuid("v7")
Generate uuid: 128 bits
ulid()
Generate ulid: 128 bits
snowflake(machine_id)
Generate snowflake: 64 bits, max value for machine_id is 65535

Array functions

length(arr)
Length of array
delete arr[1] / delete arr
Delete array item or array
seq(start, end, step)
Generate sequence array: seq command compatible
uniq(arr)
Unique items of array: uniq command compatible
_ = asort(arr)
Sort array items in ascending order
_max(arr)
Return max value of number array
_min(arr)
Return min value of number array
_sum(arr)
Return sum value of number array
_mean(arr)
Return mean value of number array
_join(arr, ",")
Join array items to string
bf_insert(item)
Bloom filter insert
bf_contains(item)
Bloom filter contains
bf_icontains(item)
Bloom filter contains, inserting the item if not found
Bloom filter with group:
- bf_insert(item,group)
- bf_contains(item,group)
- bf_icontains(item,group)
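
The two generic array operations above, length(arr) and delete, can be sketched as follows. The example uses plain awk (gawk or a recent mawk, both of which accept length on arrays and whole-array delete) and assumes zawk behaves the same; array_demo is a hypothetical name.

```shell
# Assumption: gawk or a recent mawk; swap "awk" for "zawk" to compare behaviour.
array_demo() {
  awk 'BEGIN {
    n = split("3 1 2", arr, " ")   # arr[1]="3", arr[2]="1", arr[3]="2"
    print length(arr)              # number of elements
    delete arr[1]                  # remove a single element
    has = (1 in arr)               # membership test, 0 after the delete
    print length(arr), has
    delete arr                     # remove every element
    print length(arr)
  }'
}
array_demo
```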

Math functions

rand()
Random float number between 0 and 1
srand(x)
Seeds the random number generator used by rand
abs(x)
Absolute value of x
floor(x)
Return int floor value of x
ceil(x)
Return int ceil value of x
round(x)
Return int round value of x
fend(expression)
Evaluate a math expression with fend
min(x,y,z)
Return min value of x, y, z
max(x,y,z)
Return max value of x, y, z
mkbool(s)
Return 0 or 1 for bool string: false, true, Y, N ...
int(s)
Convert s to int
float(s)
Convert s to float
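
A small sketch of seeding and truncation, using only the standard rand, srand and int built-ins in plain POSIX awk. Whether zawk's int and its random generator behave identically is an assumption here; math_demo is an invented name.

```shell
# Assumption: plain POSIX awk; swap "awk" for "zawk" to compare behaviour.
math_demo() {
  awk 'BEGIN {
    srand(42); a = rand()
    srand(42); b = rand()             # same seed, so the same first value
    same = (a == b)
    inrange = (a >= 0 && a < 1)       # rand() stays within [0, 1)
    print same, inrange
    print int(3.9), int(-3.9)         # POSIX int() truncates toward zero
  }'
}
math_demo
```

Seeding with a fixed value is handy for reproducible test data; calling srand() with no argument seeds from the current time instead.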

Date/Time functions

systime()
Current unix time
strftime()/strftime(p, timestamp)
Format timestamp by pattern
mktime(s)
Parse s to unix timestamp
datetime(s)
Parse s to date/time array
duration("2min + 12sec")
Parse expression to duration in seconds
strftime pattern: https://docs.rs/chrono/latest/chrono/format/strftime/index.html
date/time parse: https://docs.rs/dateparser/latest/dateparser/#accepted-date-formats

date/time array:
- year: 2024
- month: 1, 2
- monthday: 24
- hour: 1-24
- althour: 1-12
- minute
- second
- yearday
- weekday

JSON/CSV functions

from_json(json_text)
Parse json text to array
to_json(arr)
Output array as json text
from_csv(line)
Parse CSV line to array
to_csv(arr)
Output array as CSV line

Encode/Decode functions

encode(format, s)
Encode s with format
decode(format, s)
Decode s with format
Formats:
- hex, url, base32/58/64, base64url
- base64-hex, hex-base64
- zlib2base64url: zlib then base64url, good for online diagram service, such as PlantUML, Kroki

Crypto functions

digest(algorithm, s)
Digest s with algorithm
hmac(algorithm, secret-key, s)
HMAC of s with algorithm: HmacSHA256, HmacSHA512
jwt("HS256", secret-key, arr-payload)
Generate JWT token: HS256, HS384, HS512
dejwt(secret-key, token)
Verify JWT token and return payload array
encrypt("aes-128-cbc", "Secret Text", "pass_key")
Encrypt secret text
decrypt("aes-128-cbc", "7b9c07..", "pass_key")
Decrypt to plain text
Digest algorithm:
- md5, sha256, sha512
- bcrypt
- murmur3, xxh32, xxh64, blake3
- crc32, adler32

AES algorithm:
- aes-128-cbc, aes-256-cbc
- aes-128-gcm, aes-256-gcm

IV required for GCM: encrypt("aes-128-gcm", "Secret Text", "pass_key", "your_iv")

Network functions

http_get(url,headers)
Return HTTP response
http_post(url, headers, body)
Return HTTP response
s3_get(bucket, object_name)
Return the text value of object
s3_put(bucket, object_name, body)
Put the text body to s3 bucket
publish("nats://host:4222/topic", body)
Publish message to NATS
local_ip()
Return local ip of host
Environment variables for S3 access:
- S3_ENDPOINT
- S3_ACCESS_KEY_ID
- S3_ACCESS_KEY_SECRET
- S3_REGION

KV functions

kv_get(ns, key)
Get value by namespace and key
kv_put(ns, key, value)
Put value by namespace and key
kv_delete(ns, key)
Delete value by namespace and key
kv_clear(ns)
Clear all keys in the namespace
KV is backed by SQLite, Redis or NATS:
- SQLite: ns is a plain name, such as "cluster1", "app1"
- Redis: ns is redis://localhost:6379/0/namespace
- NATS: ns is nats://localhost:4222/bucket_name

Color

hex2rgb("#FF0000")
Convert hex color to RGB array
rgb2hex(r,g,b)
Convert RGB to hex

Misc functions

var_dump(value)
Dump a variable to the console in JSON format
log_debug(msg)/log_info/log_warn/log_error
Log msg to the console
isarray(x)
Is array or not
typeof(x)
Type name of x: array, number, string, unassigned
