Preston Thorpe

Incarcerated FOSS developer, back-end and systems dev. Command-line junkie

OCaml - Rust: A comparison of simple solutions

Following up on the first impressions post, let’s solve a problem in OCaml and compare the Rust solution.

In Response to how well received my last post was,

I thought I would follow up with some comparisons between how I would solve a simple problem (Advent of code 2022, day 3 part 1) in both languages and compare the solutions, while trying to portray the kind of thinking required (I think anyway) to begin to solve problems in (OCa)ml style languages.

The only thing I will say first.. is these kind of simple questions do not lend themselves to great comparisons, because someone well versed in their language of choice, will be much more likely and able to just solve the problem in a neat, concise, way probably in a single function

The Problem:

I’m not going to bother with the story, if you are interested, you can read it here.

I will also be leaving out the reading in of the file, as it isn’t specific to this problem, and everyone will usually have an ‘aoc’ page of helper fn’s for this anyway.

The Basics:

We need to take a string, split it in half, and then find the

one single char that occurs in both substrings, and sum the val

given to each char (a-z = 1-26) (A-Z = 27-52)

Pretty simple right? If it wasn’t for assigning the values to

chars (or if they were just ascii values), then I’d say that’s

a good-ol’ python one-liner if I ever saw one.

For the sake of simplicity. We need:

"vJrwpWtwJgWrhcsFMMfFFhFp"

to split into:

("vJrwpWtwJgWr", "hcsFMMfFFhFp")
      ^                      ^
p is the char that occurs in both substrings

     the value of p is 16 

we simply add this value for each original string.

Because I am an Ocaml n00b:

We will break this up into 3 simple functions (and a main), and try to solve the problem in relatively the same steps in both languages, so we can give them a fair comparison of basic operations.

Split the list of strings, into a list of tuples.
With that input, find and return the suspect char.
Perform some lookup or calculation to get and sum the values of the chars.
Call these from a main function + print to stdout.

🦀 🦀

// *** We iterate over the list of strings, split and collect into vec of tuples ***
fn split_string(s: Vec<&str>) -> Vec<(&str, &str)> {
    s.iter()
        .map(|line| {
            let mid = line.len() / 2;
            let (first, second) = line.split_at(mid);
            (first, second)
        })
        .collect()
}
// *** Find the character that occurs in both substrings ***
fn find_dupe(s: (&str, &str)) -> Option<char> {
    s.0.chars().find(|&c| s.1.contains(c))
}
// *** Map the char to its value ***
fn map_value(c: char) -> usize {
    if c.is_lowercase() {
        ('a'..='z')
            .enumerate() // enumerate gives us index, value so we return the index + 1 for the char
            .find_map(|(i, x)| if x == c { Some(i + 1) } else { None })
            .unwrap_or(0)
    } else {
        ('A'..='Z')
            .enumerate()
            .find_map(|(i, x)| if x == c { Some(i + 27) } else { None })
            .unwrap_or(0)
    }
}
fn main() {
    let file = std::fs::read_to_string("input").unwrap();
    let s = file.split('\n').collect::<Vec<&str>>();
    let spltlines = split_string(s);
    let total = spltlines
        .iter()
        .map(|t| {
            if let Some(val) = find_dupe(*t) {
                map_value(val)
            } else {
                0
            }
        })
        .sum::<usize>();
  println!("{total}");
}
// As I'm sure some will point out, we definitely could have made this a lot more concise
// But for the sake of comparison, we will leave it like this

So this was a touch more painful than I thought it would be… Forcing myself to split those up into three different functions when I would normally for such a simple problem just cram it all into one, being able to operate more by chaining iterators.

But I think you could definitely come up with worse examples.

Time for the 🐫’Caml 🐪

For those of you who aren’t familiar with Ocaml, I’ll try to briefly explain my very limited understanding.

(* In Ocaml, functions from other modules are called with 'Module.fn' syntax,
so String.length is the length fn from the String module. I do like how Mod
names are capitalized, instead of the public fn name like in Go. *)

(* Here we have a recursive function(denoted by the rec keyword)
called 'split_string' that takes our input lst, and an 
accumulator (an empty list in which to put the result: our tuples) *)

let rec split_string (lst: string list) acc: (string * string) list =
  match lst with
  | [] -> acc
  | hd ::tl -> let len = (String.length hd) / 2 in
              let substr_a = String.sub hd 0 len in 
              let substr_b = String.sub hd len len in
              split_string tl ((substr_a, substr_b)::acc)
;;
(* We pattern-match on the list, and while there is still some strings remaining,
we get the length of the string(hd) and then call String.sub on(which takes the string,
the position at which the substring starts, and the length of the substring) to split
it into the tuple. we append the tuple to the accumulator with the 'cons' :: operator
and then call the function recursively with the accumulator + the remainder of the list*)
------------------------------------------------------------------------------------------

(* Here we have a function with no parameters that simply computes
the integer values of lower+uppercase chars and creates a (key val) style list
of (char int) tuples e.g. ('p', 16) ('q', 17)... so pretty much seems to me like a
super inefficient hashmap. I'm sure all you OCaml wizards are cringing rn *)

let char_to_int (): (char * int) list =
  let rec mapping acc start_char end_char value =
    if start_char > end_char then
      acc
    else
      mapping ((start_char, value) :: acc) (Char.chr (Char.code start_char + 1)) end_char (value + 1)
  in
  let lower = mapping [] 'a' 'z' 1 in
  let upper = mapping [] 'A' 'Z' 27 in
  List.rev_append lower upper

(* We call that function, and then use List.assoc to get the value in our key, val tuple *)
let char_to_int_mapping = char_to_int ()
let find_value char_to_find =
  try
    let result = List.assoc char_to_find char_to_int_mapping in
    Some result
  with Not_found -> None
;;
-------------------------------------------------------------------------------
(* Simple function to find the char between the two substrings, utilizing
 String.fold_left which allows us to apply a fn on each char of a string,
so we call Strings.contains on the other string on each char, and append it *)

let find_common_char (s1, s2) =
  let common_chars =
    String.fold_left
      (fun acc c -> if String.contains s2 c then c :: acc else acc)
      [] s1
  in
  match common_chars with
  | [] -> None
  | common_char :: _ -> Some common_char
;;

(* Another recursive function to sum up the values of the chars. we cheated in
rust by having this in the main fn. but notice how, here we cannot call 
.unwrap_or() or similar fn's, we must pattern match *)

let rec sum_vals (lst: (string * string) list) (sum: int): int =
  match lst with 
  | [] -> sum
  | hd::tl -> 
      sum_vals tl sum + (
        match find_common_char hd with
            | None -> 0
            | Some ch -> match find_value ch with 
                        | Some n -> n
                        | None -> 0)
;;

let lines = split_lines (read_lines "input") [] in
let ans = sum_vals lines 0 in
print_int ans

(*^ Here would be our main fn, we are just calling the above functions, read_lines
is the helper fn that reads the file in to a list of strings *)

Now, I’m sure that was a mess, but I have only been messing with the ‘Caml for a couple weeks now. Please feel free to hit me up with corrections/pointers if you are an Ocaml wizard, I’d love to learn as much as I can. You’ll have to email me though, am I far too lazy to setup sqlite for comments.

I imagine that as I get more comfortable with the standard library I will end up using built in features where currently I might use recursion. However I am starting to enjoy the style and find it quite satisfying to produce an effective recursive solution.

I haven’t yet looked into the other common data structures either, so I am sure there are plenty of other improvements that can be made.

If you were unfamiliar with either language.. god help you, the whole post probably looked pretty scary depending on what background you have.

As I have learned a tiny bit more, it is definitely growing on me, and tasks I was tripping over even a few days ago are becoming a little more natural. The farther I get, the more I realize the similarities and as I go on, whatever advantage there might be from using a language that was heavily influenced by it, will probably start to show more and more.

I hope to provide (maybe more concise next time) more of my journey and thoughts on the ‘Caml, which I can say, I most definitely recommend to anyone who is interested in challenging their idea and perception of programming, while getting to learn a language with some very different syntax but also some great FP concepts/features that is actually used outside of Academia (Haskell we’re looking at you)

Note: This is meant to be a comparison to show people who are interested in learning 🐫 just some of the initial differences, and structure, to see what their experience might be like, as they begin to play around with the language. I try to learn every day, and the more that I learn, the more I realize that I am an expert at nothing.

Thanks for reading this, feel free to reach out to me if you are interested in discussing these topics, or if you have any corrections/pointers for me. I can be found on the eza matrix/gitter channel. or you can email me at p@eza.rocks

EDIT:

A solution from an actual OCaml programmer:

This excellent, concise solution was emailed to me by Yawar Amin https://github.com/yawaramin

(after finding him on github, it became immediately obvious that he is indeed, one of the before-mentioned OCaml wizards)

module SC = Set.Make(Char) (* Module for set of char *)

let prio ch =
  let code = Char.code ch in
  if code >= 97 then code - 96 (* a-z *)
  else code - 38 (* A-Z *)

let common string =
  let len = String.length string in
  let half_len = len / 2 in
  let first = SC.of_seq (String.to_seq (String.sub string 0 half_len)) in
  let second = SC.of_seq (String.to_seq (String.sub string half_len
half_len)) in
  let common_char = SC.inter first second in (* Set intersection *)
  assert (SC.cardinal common_char = 1); (* Assert set contains only
one member *)
  SC.min_elt common_char (* Min element must be the common character *)

let common_prio string = string |> common |> prio
let result list = List.fold_left (fun sum string -> sum + common_prio
string) list

This clearly takes advantage of data structures in the standard library that I have not yet explored. (Obviously it’s implementation of Set, but also seq): A seq or sequence is type of list that is lazily evaluated, increasing performance but also allowing for things like conceptually infinite sequences. I guess sequences are found in Haskell and F# (called a stream in Scala), and can be best compared to Iterators in Rust. This sending me down a pretty deep rabbit hole that could take up a whole if not several posts, learning about the use of lazy, seq, and iter, and the differences between them.

I think this highlights one of the hurdles of a new paradigm, is it forces you to almost hyper focus on trivial tasks, causing you to forget the bigger picture, and not reach for the tools you would in other languages.

Keeping with the nature of the post: I thought I would try to implement the exact same solution in Rust, mimicking it as closely as possible. (fn names and all)

fn prio(ch: char) -> i32 {
    let code = ch as u32;
    if code >= 97 {
        (code - 96) as i32 // a-z
    } else {
        (code - 38) as i32 // A-Z
    }
}

fn common(string: &str) -> char {
    let len = string.len();
    let half_len = len / 2;

    let first: HashSet<_> = string.chars().take(half_len).collect();
    let second: HashSet<_> = string.chars().skip(half_len).take(half_len).collect();

    let common_char: HashSet<_> = first.intersection(&second).cloned().collect();

    *common_char.iter().next().unwrap() // Min element must be the common character
}
// To demonstrate '|>'
fn common_prio(string: &str) -> i32 {
    prio(common(string))
}

fn result(list: Vec<&str>) -> i32 {
    list.iter().fold(0, |sum, string| sum + common_prio(string))
}

fn main() {
    let strings = fs::read_to_string("input").unwrap();
    let splitstr = strings.lines().collect::<Vec<_>>();
    let result_value = result(splitstr);
    println!("Result: {}", result_value);
}

Notice how awesome the |> operator is in OCaml (the “pipe” operator, as it’s so appropriately called), and what the same thing looks like in Rust. (Really any C based lang, but here in Rust, and unless you were to implement trait methods and then chain them).

Seeing the comparison really makes me appreciate the functional style even more. Big thank you to Yawar for the solution 👍.

2024 2
2023 6

2024

Confusing or misunderstood topics in systems programming: Part 1

13 minute read

Confusing or misunderstood topics in systems programming: Part 0

9 minute read

Processes, Pipes, I/O, Files and Threads/Async

Preston Thorpe

OCaml - Rust: A comparison of simple solutions

In Response to how well received my last post was,

The Problem:

The Basics:

Because I am an Ocaml n00b:

A solution from an actual OCaml programmer:

2024

Confusing or misunderstood topics in systems programming: Part 1

Confusing or misunderstood topics in systems programming: Part 0

2023

How I got here

Just learn everything

Gratitude

OCaml - Rust: A comparison of simple solutions

The hackers guide to the hackernews front page

OCaml: A Rust developer’s first impressions