{"id":236,"date":"2025-07-20T14:26:49","date_gmt":"2025-07-20T13:26:49","guid":{"rendered":"https:\/\/mamdouh.de\/?p=236"},"modified":"2025-07-20T18:54:05","modified_gmt":"2025-07-20T17:54:05","slug":"csv","status":"publish","type":"post","link":"https:\/\/mamdouh.de\/index.php\/2025\/07\/20\/csv\/","title":{"rendered":"CSV"},"content":{"rendered":"\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Relationships are like CSVs\u2014one wrong delimiter, and it all breaks.<\/p>\n<\/blockquote>\n\n\n\n<p class=\"has-medium-font-size\">CSV stands for <strong><em>Comma-Separated Values<\/em><\/strong>. It&#8217;s a plain text format that stores tables. Each line holds one record, and fields within a record are separated by commas. Many programs use CSV files because they&#8217;re easy to create, read, and move between systems. For filtering and analyzing CSV data from the command line, csvkit is a useful tool. Below are simple commands for basic exploratory data analysis (EDA) of a CSV file:<\/p>\n\n\n\n<h1 class=\"wp-block-heading has-medium-font-size\">Step 1: Install or upgrade csvkit (on Windows 11)<\/h1>\n\n\n\n<p><code>pip install --upgrade csvkit<\/code><\/p>\n\n\n\n<h1 class=\"wp-block-heading has-medium-font-size\">Step 2: Extract rows where the &#8220;Attack_type&#8221; column contains &#8220;Normal&#8221;<\/h1>\n\n\n\n<p><code>csvgrep -c \"Attack_type\" -r \"Normal\" data.csv &gt; normal_rows.csv<\/code><\/p>\n\n\n\n<h1 class=\"wp-block-heading has-medium-font-size\">Step 3: View structure and summary stats of the filtered file<\/h1>\n\n\n\n<p><code>csvstat normal_rows.csv<\/code><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"409\" height=\"180\" src=\"https:\/\/mamdouh.de\/wp-content\/uploads\/2025\/07\/csvstat.png\" alt=\"\" class=\"wp-image-239\"\/><\/figure>\n\n\n\n<h1 class=\"wp-block-heading has-medium-font-size\">Step 4: Check missing values and summary stats for the 
&#8220;dns.qry.name.len&#8221; column (same as Step 3, but for a single column)<\/h1>\n\n\n\n<p><code>csvstat -c \"dns.qry.name.len\" normal_rows.csv<\/code><\/p>\n\n\n\n<h1 class=\"wp-block-heading has-medium-font-size\">Step 5: See the frequency distribution for &#8220;dns.qry.name.len&#8221;<\/h1>\n\n\n\n<p><code>csvcut -c \"dns.qry.name.len\" normal_rows.csv | csvstat --freq<\/code><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Relationships are like CSVs\u2014one wrong delimiter, and it all breaks. CSV stands for Comma-Separated Values. It&#8217;s a plain text format that stores tables. Each line holds&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"class_list":["post-236","post","type-post","status-publish","format-standard","hentry","category-cybersecurity"],"_links":{"self":[{"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/posts\/236","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/comments?post=236"}],"version-history":[{"count":13,"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/posts\/236\/revisions"}],"predecessor-version":[{"id":260,"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/posts\/236\/revisions\/260"}],"wp:attachment":[{"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/media?parent=236"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mamdouh.de\/index.php\/wp-json\/wp\/v2\/categories?post=236"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mamdouh.de\/i
ndex.php\/wp-json\/wp\/v2\/tags?post=236"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}