{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Linux Operation System as a base for Spatial Ecology Computing\n", "\n", "[Linux](https://linux.org/) is a generic term refering to Unix-like computer operating systems based on the Linux kernel. Their development is one of the most prominent examples of free and open source software collaboration; typically all the underlying source code can be used, freely modified, and redistributed, both commercially and non-commercially, by anyone under licenses such as the [GNU](https://www.gnu.org/).\n", "\n", "In this site an introduction will be given to the [Unix/Linux Shell](https://en.wikipedia.org/wiki/Unix_shell) using [Bash](https://en.wikipedia.org/wiki/Bash) language to manipulate data rather than interacting with/setting the operation system. The final aim is to build a stand-alone implementation / processes that include a combination of bash/R/AWK/gnuplot commands that can be run several times using the features of each software.\n", "In this part of the training site we provide various examples of bash commands reported in this [Unix/Linux Command Reference](https://files.fosswire.com/2007/08/fwunixref.pdf).\n", "\n", "In the jupyter-notebook you can call/use bash language by using two symbols:\n", "\n", " %%bash\n", " bash-command\n", "\n", "before the bash commands, or\n", "\n", " ! bash-command\n", "\n", "followed by the bash commands\n", "\n", "\n", "## Bash language syntax\n", "\n", "The object of this document is the use of Bash language to explore and manipulate files rather than to set/interact with the operation system. You can read and follow jupter-notebook or you can copy the commands included in the frames part of this document and paste them into an interactive Bash shell. Once you have familiarity with the general commands of Bash you can further advance in learning bash with online manuals and guides. There is a large variety of documentation available at: http://www.linux.org/lessons/advanced/x1110.html\n", "http://tldp.org/LDP/abs/html/\n", "\n", "The best way is just to try each command using a file, and/or search on the Internet for more examples and deeper explanations.\n", "\n", "**Searching for a command, getting help**\n", "\n", "In a shell window (the terminal) the following prompt is written:\n", "\n", " user@pc_name:directrory$\n", "\n", "after the $ you are able to insert the command\n", "Command syntax:\n", "\n", " command [option] [file]\n", "\n", "The square bracts \"[ ]\" identify an optional feature of the command. It can be inserted to retrieve more information or different setting of a command.\n", "To get a command for a specific action type \"man -k thewordthatyouneed\"\n", "\n", "e.g. I want to search for a command able to count the line in a file" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "acct (2) - switch process accounting on or off\n", "acct (5) - process accounting file\n", "argz_count (3) - functions to handle an argz list\n", "cksum (1) - checksum and count the bytes in a file\n", "CPU_COUNT (3) - macros for manipulating CPU sets\n", "CPU_COUNT_S (3) - macros for manipulating CPU sets\n", "error_message_count (3) - glibc error reporting functions\n", "ibv_attach_counters_point_flow (3) - attach individual counter definition to ...\n", "ibv_destroy_counters (3) - Create or destroy a counters handle\n", "ibv_read_counters (3) - Read counter values\n", "fincore (1) - count pages of file contents in core\n", "get_avphys_pages (3) - get total and available physical page counts\n", "get_phys_pages (3) - get total and available physical page counts\n", "git-count-objects (1) - Count unpacked number of objects and their disk consu...\n", "goa-daemon (8) - GNOME Online Accounts Daemon\n", "ibv_create_counters (3) - Create or destroy a counters handle\n", "mlx5dv_dr_action_create_flow_counter (3) - Create devx flow counter actions\n", "mlx5dv_ts_to_ns (3) - Convert device timestamp from HCA core clock units to ...\n", "pam_lastlog (8) - PAM module to display date of last login and perform i...\n", "pam_succeed_if (8) - test account characteristics\n", "pam_tally (8) - The login counter (tallying) module\n", "pam_tally2 (8) - The login counter (tallying) module\n", "pcre16_refcount (3) - Perl-compatible regular expressions\n", "pcre2_get_ovector_count (3) - Perl-compatible regular expressions (revised API)\n", "pcre32_refcount (3) - Perl-compatible regular expressions\n", "pcre_refcount (3) - Perl-compatible regular expressions\n", "rdma-statistic (8) - RDMA statistic counter configuration\n", "sum (1) - checksum and count the blocks in a file\n", "systemd-bless-boot-generator (8) - Pull systemd-bless-boot.service into the i...\n", "timer_getoverrun (2) - get overrun count for a POSIX per-process timer\n", "userdel (8) - delete a user account and related files\n", "usermod (8) - modify a user account\n", "v.in.geonames (1grass) - Imports geonames.org country files into a vector poi...\n", "v.qcount (1grass) - Indices for quadrat counts of vector point lists.\n", "v.vect.stats (1grass) - Count points in areas, calculate statistics from poin...\n", "wc (1) - print newline, word, and byte counts for each file\n" ] } ], "source": [ "! man -k count" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "in the last lines you get:\n", "\n", "“wc (1) - print newline, word, and byte counts for each file”\n", "\n", "so the command “wc” is your command. To get information about a command type “man command” or info “command” e.g." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "WC(1) User Commands WC(1)\r\n", "\r\n", "N\bNA\bAM\bME\bE\r\n", " wc - print newline, word, and byte counts for each file\r\n", "\r\n", "S\bSY\bYN\bNO\bOP\bPS\bSI\bIS\bS\r\n", " w\bwc\bc [_\bO_\bP_\bT_\bI_\bO_\bN]... [_\bF_\bI_\bL_\bE]...\r\n", " w\bwc\bc [_\bO_\bP_\bT_\bI_\bO_\bN]... _\b-_\b-_\bf_\bi_\bl_\be_\bs_\b0_\b-_\bf_\br_\bo_\bm_\b=_\bF\r\n", "\r\n", "D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN\r\n", " Print newline, word, and byte counts for each FILE, and a total line if\r\n", " more than one FILE is specified. A word is a non-zero-length sequence\r\n", " of characters delimited by white space.\r\n", "\r\n", " With no FILE, or when FILE is -, read standard input.\r\n", "\r\n", " The options below may be used to select which counts are printed, al‐\r\n", " ways in the following order: newline, word, character, byte, maximum\r\n", " line length.\r\n", "\r\n", " -\b-c\bc, -\b--\b-b\bby\byt\bte\bes\bs\r\n", " print the byte counts\r\n", "\r\n", " -\b-m\bm, -\b--\b-c\bch\bha\bar\brs\bs\r\n", " print the character counts\r\n", "\r\n", " -\b-l\bl, -\b--\b-l\bli\bin\bne\bes\bs\r\n", " print the newline counts\r\n", "\r\n", " -\b--\b-f\bfi\bil\ble\bes\bs0\b0-\b-f\bfr\bro\bom\bm=_\bF\r\n", " read input from the files specified by NUL-terminated names in\r\n", " file F; If F is - then read names from standard input\r\n", "\r\n", " -\b-L\bL, -\b--\b-m\bma\bax\bx-\b-l\bli\bin\bne\be-\b-l\ble\ben\bng\bgt\bth\bh\r\n", " print the maximum display width\r\n", "\r\n", " -\b-w\bw, -\b--\b-w\bwo\bor\brd\bds\bs\r\n", " print the word counts\r\n", "\r\n", " -\b--\b-h\bhe\bel\blp\bp display this help and exit\r\n", "\r\n", " -\b--\b-v\bve\ber\brs\bsi\bio\bon\bn\r\n", " output version information and exit\r\n", "\r\n", "A\bAU\bUT\bTH\bHO\bOR\bR\r\n", " Written by Paul Rubin and David MacKenzie.\r\n", "\r\n", "R\bRE\bEP\bPO\bOR\bRT\bTI\bIN\bNG\bG B\bBU\bUG\bGS\bS\r\n", " GNU coreutils online help: \r\n", " Report wc translation bugs to \r\n", "\r\n", "C\bCO\bOP\bPY\bYR\bRI\bIG\bGH\bHT\bT\r\n", " Copyright © 2018 Free Software Foundation, Inc. License GPLv3+: GNU\r\n", " GPL version 3 or later .\r\n", " This is free software: you are free to change and redistribute it.\r\n", " There is NO WARRANTY, to the extent permitted by law.\r\n", "\r\n", "S\bSE\bEE\bE A\bAL\bLS\bSO\bO\r\n", " Full documentation at: \r\n", " or available locally via: info '(coreutils) wc invocation'\r\n", "\r\n", "GNU coreutils 8.30 September 2019 WC(1)\r\n" ] } ], "source": [ "! man wc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Input/Output redirect\n", "**Running a command, saving a result**\n", "\n", "The symbols \">\" are used to save the result of a command in a file. Instead \"<\" is used to retrieve information from a file. In these cases, using the informatics terminology we can use the expression 'standard input redirection\" or and \"standard output redirection\".\n", "\n", "This [page](https://fsl.fmrib.ox.ac.uk/fslcourse/unix_intro/io.html) summarize the Standard Input and Output Redirection commonly used.\n", "\n", "In this course we will mainly use the symbol \">\", \">>\", \"<\". e.g." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "00_Setting_Colab_for_for_Spatial_Ecology_course.ipynb 02_pktools_osgeo.ipynb\r\n", "01_gdal.ipynb\t\t\t\t\t 03_bash_osgeo.ipynb\r\n", "02_pktools_colab.ipynb\t\t\t\t geodata\r\n" ] } ], "source": [ "!ls" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "! ls > mylist.txt" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "00_Setting_Colab_for_for_Spatial_Ecology_course.ipynb\r\n", "01_gdal.ipynb\r\n", "02_pktools_colab.ipynb\r\n", "02_pktools_osgeo.ipynb\r\n", "03_bash_osgeo.ipynb\r\n", "geodata\r\n", "mylist.txt\r\n" ] } ], "source": [ "! more mylist.txt" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "! ls >> mylist.txt" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "00_Setting_Colab_for_for_Spatial_Ecology_course.ipynb\r\n", "01_gdal.ipynb\r\n", "02_pktools_colab.ipynb\r\n", "02_pktools_osgeo.ipynb\r\n", "03_bash_osgeo.ipynb\r\n", "geodata\r\n", "mylist.txt\r\n", "00_Setting_Colab_for_for_Spatial_Ecology_course.ipynb\r\n", "01_gdal.ipynb\r\n", "02_pktools_colab.ipynb\r\n", "02_pktools_osgeo.ipynb\r\n", "03_bash_osgeo.ipynb\r\n", "geodata\r\n", "mylist.txt\r\n" ] } ], "source": [ "! more mylist.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Special Characters\n", "Special characters, also called metacharacters, are a group of characters that have particular meanings in the bash language. Listed here are those used in the following scripts. Type the examples and try to get the meaning.\n", "\n", "The asterisk \"*\" symbol identifies a string with one or more character " ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/dev/tty /dev/tty23\t/dev/tty39 /dev/tty54\t /dev/ttyS10 /dev/ttyS26\r\n", "/dev/tty0 /dev/tty24\t/dev/tty4 /dev/tty55\t /dev/ttyS11 /dev/ttyS27\r\n", "/dev/tty1 /dev/tty25\t/dev/tty40 /dev/tty56\t /dev/ttyS12 /dev/ttyS28\r\n", "/dev/tty10 /dev/tty26\t/dev/tty41 /dev/tty57\t /dev/ttyS13 /dev/ttyS29\r\n", "/dev/tty11 /dev/tty27\t/dev/tty42 /dev/tty58\t /dev/ttyS14 /dev/ttyS3\r\n", "/dev/tty12 /dev/tty28\t/dev/tty43 /dev/tty59\t /dev/ttyS15 /dev/ttyS30\r\n", "/dev/tty13 /dev/tty29\t/dev/tty44 /dev/tty6\t /dev/ttyS16 /dev/ttyS31\r\n", "/dev/tty14 /dev/tty3\t/dev/tty45 /dev/tty60\t /dev/ttyS17 /dev/ttyS4\r\n", "/dev/tty15 /dev/tty30\t/dev/tty46 /dev/tty61\t /dev/ttyS18 /dev/ttyS5\r\n", "/dev/tty16 /dev/tty31\t/dev/tty47 /dev/tty62\t /dev/ttyS19 /dev/ttyS6\r\n", "/dev/tty17 /dev/tty32\t/dev/tty48 /dev/tty63\t /dev/ttyS2\t /dev/ttyS7\r\n", "/dev/tty18 /dev/tty33\t/dev/tty49 /dev/tty7\t /dev/ttyS20 /dev/ttyS8\r\n", "/dev/tty19 /dev/tty34\t/dev/tty5 /dev/tty8\t /dev/ttyS21 /dev/ttyS9\r\n", "/dev/tty2 /dev/tty35\t/dev/tty50 /dev/tty9\t /dev/ttyS22\r\n", "/dev/tty20 /dev/tty36\t/dev/tty51 /dev/ttyprintk /dev/ttyS23\r\n", "/dev/tty21 /dev/tty37\t/dev/tty52 /dev/ttyS0\t /dev/ttyS24\r\n", "/dev/tty22 /dev/tty38\t/dev/tty53 /dev/ttyS1\t /dev/ttyS25\r\n" ] } ], "source": [ "! ls /dev/tty*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The questionmark \"?\" symbol identifies a a single character" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/dev/tty0\n", "/dev/tty1\n", "/dev/tty2\n", "/dev/tty3\n", "/dev/tty4\n", "/dev/tty5\n", "/dev/tty6\n", "/dev/tty7\n", "/dev/tty8\n", "/dev/tty9\n" ] } ], "source": [ "%%bash\n", "ls /dev/tty?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The square brackets \"[ ]\" identify one of a single character listed" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/dev/tty2 /dev/tty3 /dev/tty4\r\n" ] } ], "source": [ "! ls /dev/tty[2-4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Curly brackets \"{}\" symbol identify one of a single string listed" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/dev/loop0\n", "/dev/loop1\n", "/dev/loop10\n", "/dev/loop11\n", "/dev/loop12\n", "/dev/loop13\n", "/dev/loop14\n", "/dev/loop15\n", "/dev/loop16\n", "/dev/loop2\n", "/dev/loop3\n", "/dev/loop4\n", "/dev/loop5\n", "/dev/loop6\n", "/dev/loop7\n", "/dev/loop8\n", "/dev/loop9\n", "/dev/loop-control\n", "/dev/tty\n", "/dev/tty0\n", "/dev/tty1\n", "/dev/tty10\n", "/dev/tty11\n", "/dev/tty12\n", "/dev/tty13\n", "/dev/tty14\n", "/dev/tty15\n", "/dev/tty16\n", "/dev/tty17\n", "/dev/tty18\n", "/dev/tty19\n", "/dev/tty2\n", "/dev/tty20\n", "/dev/tty21\n", "/dev/tty22\n", "/dev/tty23\n", "/dev/tty24\n", "/dev/tty25\n", "/dev/tty26\n", "/dev/tty27\n", "/dev/tty28\n", "/dev/tty29\n", "/dev/tty3\n", "/dev/tty30\n", "/dev/tty31\n", "/dev/tty32\n", "/dev/tty33\n", "/dev/tty34\n", "/dev/tty35\n", "/dev/tty36\n", "/dev/tty37\n", "/dev/tty38\n", "/dev/tty39\n", "/dev/tty4\n", "/dev/tty40\n", "/dev/tty41\n", "/dev/tty42\n", "/dev/tty43\n", "/dev/tty44\n", "/dev/tty45\n", "/dev/tty46\n", "/dev/tty47\n", "/dev/tty48\n", "/dev/tty49\n", "/dev/tty5\n", "/dev/tty50\n", "/dev/tty51\n", "/dev/tty52\n", "/dev/tty53\n", "/dev/tty54\n", "/dev/tty55\n", "/dev/tty56\n", "/dev/tty57\n", "/dev/tty58\n", "/dev/tty59\n", "/dev/tty6\n", "/dev/tty60\n", "/dev/tty61\n", "/dev/tty62\n", "/dev/tty63\n", "/dev/tty7\n", "/dev/tty8\n", "/dev/tty9\n", "/dev/ttyprintk\n", "/dev/ttyS0\n", "/dev/ttyS1\n", "/dev/ttyS10\n", "/dev/ttyS11\n", "/dev/ttyS12\n", "/dev/ttyS13\n", "/dev/ttyS14\n", "/dev/ttyS15\n", "/dev/ttyS16\n", "/dev/ttyS17\n", "/dev/ttyS18\n", "/dev/ttyS19\n", "/dev/ttyS2\n", "/dev/ttyS20\n", "/dev/ttyS21\n", "/dev/ttyS22\n", "/dev/ttyS23\n", "/dev/ttyS24\n", "/dev/ttyS25\n", "/dev/ttyS26\n", "/dev/ttyS27\n", "/dev/ttyS28\n", "/dev/ttyS29\n", "/dev/ttyS3\n", "/dev/ttyS30\n", "/dev/ttyS31\n", "/dev/ttyS4\n", "/dev/ttyS5\n", "/dev/ttyS6\n", "/dev/ttyS7\n", "/dev/ttyS8\n", "/dev/ttyS9\n" ] } ], "source": [ "%%bash\n", "ls /dev/{tty,loop}*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quoting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can prevent the shell from interpreting a metacharacter by placing a backslash \"\\\".\n", "In this way the metacharacter become a normal character.\n", "\n", "file1 will be copied to file?" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "! cp mylist.txt mylist\\?.txt\n", "! ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also insert the metacharacter between quotation marks." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ls: cannot access '/dev/tt*': No such file or directory\r\n" ] } ], "source": [ "! ls /dev/\"tt*\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pipe\n", "\n", "The pipe \"|\" metacharacter enables you to run a set of chained processes.\n", "To understand lets do an example creating a temporal file called tmp.txt and counting how many lines there are in the file." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2227 tmp.txt\n" ] } ], "source": [ "%%bash\n", "ls /usr/bin > tmp.txt\n", "wc -l tmp.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same can be written" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2227\r\n" ] } ], "source": [ "! ls /usr/bin | wc -l" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "without creating an intermediate file." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }