author | Aaron Ball <nullspoon@oper.io> | 2023-05-25 12:15:13 -0600 |
---|---|---|
committer | Aaron Ball <nullspoon@oper.io> | 2023-05-25 12:15:13 -0600 |
commit | c9bfeb885a6b5f4ebea3ef5ff26023ac506daca3 (patch) | |
tree | 61582a6ea2a3519f5df6073053f5f6746f7ec94e | |
parent | 432a282e3cb849ee8a5b29c8883f683bc6113229 (diff) | |
Case Insensitive Matching in C: Convert to markdown
-rw-r--r-- | posts/case_insensitive_matching_in_c.md (renamed from posts/case_insensitive_matching_in_c.adoc) | 97 |
1 files changed, 41 insertions, 56 deletions
diff --git a/posts/case_insensitive_matching_in_c.adoc b/posts/case_insensitive_matching_in_c.md
index a2d7e1c..4ece0e0 100644
--- a/posts/case_insensitive_matching_in_c.adoc
+++ b/posts/case_insensitive_matching_in_c.md
@@ -1,51 +1,46 @@
 Case Insensitive Matching in C++
 ================================
-:author: Aaron Ball
-:email: nullspoon@iohq.net
-
-
 
-I had this epiphany yesterday while working on my new command line
-https://oper.io/src/nullspoon/noteless.git[note-taking project] and I wanted to
-write a blog post about it since I haven't seen anyone on the internet yet take
-this approach (though there aren't exactly a lot blogs posts on programming
-theory of this kind in general).
+I had this epiphany yesterday while working on my new command line [note-taking
+project](https://oper.io/src/nullspoon/noteless.git) and I wanted to write a
+blog post about it since I haven't seen anyone on the internet yet take this
+approach (though there aren't exactly a lot blogs posts on programming theory
+of this kind in general).
 
 My program is written in C. It provides a search functionality very similar to
 the case insensitive matching of _grep -i_ (you 'nix users should know what I'm
 talking about). If you've done much in C, you likely know that string parsing
-is not so easy (or is it just different). Thus the question...__how to perform
-case insensitive text searching in c__.
+is not so easy (or is it just different). Thus the question... _how to perform
+case insensitive text searching in c_.
 
 A few notes though before we proceed. I'm fairly new to c (about 1 year as a
 hobby) so everything I say here might not be entirely right (it'll work, it
 just might not be the _best_ way). If you catch something that's wrong or could
-use improvement, please send me link:/?p=About[an email]. Secondly, since this
-is probably something the C gods have already mastered, I will be writing
-this post aimed at the newer folk (since I myself am one), so bear with me if
-you already know how to do this. One final note. I am still ceaselessly amazed
-at how computers work, so I get fairly giddy when it comes to actual memory
+use improvement, please send me [an email](/?p=About). Secondly, since this is
+probably something the C gods have already mastered, I will be writing this
+post aimed at the newer folk (since I myself am one), so bear with me if you
+already know how to do this. One final note. I am still ceaselessly amazed at
+how computers work, so I get fairly giddy when it comes to actual memory
 management and whatnot. Brace yourselves...
 
 
-[[chars-ints-kind-of]]
 Chars == Ints (kind of)
 -----------------------
 
 To continue, we need to understand a few things about base data types in
 memory.
 
-* **Ints**: An int is just 8 bits of memory (well, it's 16 including
-signing, but we don't need to cover that here).
+* **Ints**: An int is just 8 bits of memory (well, it's 16 including signing,
+  but we don't need to cover that here).
 
-* **Chars**: Chars are just ints, but marked as chars. Effectively, a
-number has been assigned to each letter and symbol (including uppercase and
-lowercase), which is where integers meet chars. The integer determines which
-char is selected.
+* **Chars**: Chars are just ints, but marked as chars. Effectively, a number
+  has been assigned to each letter and symbol (including uppercase and
+  lowercase), which is where integers meet chars. The integer determines which
+  char is selected.
 
 To demonstrate those two data types, let's take a look at some sample code.
 
------
+```
 using namespace std;
 #include <iostream>
 
@@ -56,40 +51,38 @@ int main( int argc, char** argv ) {
   cout << " is the same as char " << c << "!" << endl;
   return 0;
 }
------
+```
 
-What we do here is create <code>int i</code> with the value of 72. We
-then create <code>char c</code> and assign it the value of _i_ (still
-72). Finally, we print both int i and char c and get...
+What we do here is create `int i` with the value of `72`. We then create `char
+c` and assign it the value of `i` (still 72). Finally, we print both `int i`
+and `char c` and get...
 
------
+```
 The integer 72 is the same as char H!
------
+```
 
-If you're wondering, we could have also just assigned char c the value
-of 72 explicitly and it would have still printed the letter H.
+If you're wondering, we could have also just assigned char c the value of 72
+explicitly and it would have still printed the letter H.
 
 Now that that's out of the way...
 
 
-[[a-short-char---integer-list]]
 A Short Char - Integer List
 ---------------------------
 
-* **! " # $ % & ' ( ) * + , - . /**: 35 - 47
+* `! " # $ % & ' ( ) * + , - . /`: 35 - 47
 
-* **0-9**: 48 - 57
+* `0-9`: 48 - 57
 
-* **: ; < = > ? @**: 58 - 64
+* `: ; < = > ? @`: 58 - 64
 
-* *A - Z* (uppercase): 65 - 90
+* `A - Z` _(uppercase)_: 65 - 90
 
-* **[ \ ] ^ _ `**: 91 - 96
+* `` [ \ ] ^ _ ` ``: 91 - 96
 
-* *a - z* (lowercase): 97 - 122
+* `a - z` _(lowercase)_: 97 - 122
 
 
-[[lowercase-uppercase-32]]
 Lowercase == Uppercase + 32
 ---------------------------
 
@@ -104,7 +97,6 @@ equivalent is going to be 32 lower (int 65).
 Suddenly parsing text just got a lot easier.
 
 
-[[piecing-it-all-together]]
 Piecing it all together
 -----------------------
 
@@ -112,13 +104,13 @@ Since characters are simply just integers, we can perform text matching via
 number ranges and math operators. For instance...
 
 Suppose you want to build a password validator that allows numbers, upper case,
-lower case, and __: ; < = > ? @ [ \ ] ^ _ `__. That is the integer range 48 -
+lower case, and `` : ; < = > ? @ [ \ ] ^ _ ` ``. That is the integer range 48 -
 57 (the char equivelants of integers), 58 - 64 (the first symbols), 65 - 90 (the
 uppercase), 91 - 96 (the second set of symbols), and 97-122 (the lowercase).
 Combining those ranges, the allowable characters make up the integer range of
 48 - 122. Thus, our program might look something like...
 
------
+```
 using namespace std;
 #include <iostream>
 
@@ -153,14 +145,14 @@ int main( int argc, char** argv ) {
   }
   return 0;
 }
------
+```
 
 Will output...
 
------
+```
 good_password123 is valid.
 bad_password! is not valid.
------
+```
 
 The first password succeeds because all of its characters are within the range
 of 48 - 122. The second password fails because its final character, the "!", is
@@ -182,13 +174,6 @@ usable through conversion to integers. It makes me want to see what other
 real-life data I can convert to numbers for easier parsing. Images? Chemistry
 notation?
 
-I do say my good man, http://www.bartleby.com/70/1322.html[Why, then the
-world’s mine oyster, Which I with numbers will open.] (okay, I may have
-modified the quote a tad)
-
-
-Category:Programming
-Category:C
-
-
-// vim: set syntax=asciidoc:
+I do say my good man, [Why, then the world’s mine oyster, Which I with numbers
+will open.](http://www.bartleby.com/70/1322.html) (okay, I may have modified
+the quote a tad)
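
For readers skimming this commit, the post's "Lowercase == Uppercase + 32" idea can be sketched roughly as below. This is not code from the post or from the noteless project: the names `ascii_lower` and `find_ignore_case` are hypothetical, chosen only for illustration, and the sketch assumes plain ASCII input (letters 65 - 90 and 97 - 122, as in the post's list). It is written in C++ to match the post's own samples.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: map an ASCII uppercase letter (65 - 90) to its
// lowercase form by adding 32. Every other character is returned unchanged.
char ascii_lower(char c) {
  if (c >= 'A' && c <= 'Z') {
    return static_cast<char>(c + 32);
  }
  return c;
}

// Hypothetical helper: return the index of the first case-insensitive
// match of needle in haystack, or std::string::npos if there is none.
// An empty needle matches at index 0.
std::size_t find_ignore_case(const std::string& haystack,
                             const std::string& needle) {
  if (needle.size() > haystack.size()) {
    return std::string::npos;
  }
  for (std::size_t start = 0; start + needle.size() <= haystack.size();
       ++start) {
    std::size_t matched = 0;
    // Compare character by character after folding both sides to lowercase.
    while (matched < needle.size() &&
           ascii_lower(haystack[start + matched]) ==
               ascii_lower(needle[matched])) {
      ++matched;
    }
    if (matched == needle.size()) {
      return start;
    }
  }
  return std::string::npos;
}
```

Calling `find_ignore_case("Hello World", "WORLD")` would return 6 under these assumptions. The range check before adding 32 matters: without it, symbols such as `@` (64) or `[` (91) would be shifted into unrelated characters.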