Last news...

Trying OCR with GCloud Document AI

OCR stands for “Optical Character Recognition”, a powerful technique for extracting text (and possibly its position, font etc.) from images. The task is far from trivial, given all the possible fonts, colors and image qualities out there. The text may not even lie on a straight horizontal line… You guessed it: anything is possible in the wild, and the first step in making sense of it is to extract the characters.

continue...

Overview of Smart Order Routing for Efficient and Compliant Trading

  • 05 Oct 2023
  • Manuel Capel
  • Tags: finance

Smart Order Routing (SOR) consists of executing orders across trading venues in the most efficient way. On the surface, it’s pretty simple: as a broker-dealer, you get a request from a customer to buy a certain amount of a company’s stock, you take a quick look at the different exchanges, maybe you split the order if its volume is high, and that’s it. Well, not quite…

continue...

Build a Minimal Webserver in C

  • 05 Aug 2022
  • Manuel Capel
  • Tags: C web

Machines are already strong by themselves, but their potential gets truly unleashed when they can communicate with each other. Web servers enable this communication by taking requests from and sending back responses to other machines. Here we will see how to implement a very simple web server in C for Linux. You can find the complete code for it on this repo.

continue...

Create a complete instance on Google Cloud for ML

If the volume of data to process exceeds the capacity of your local computer, it may be time to switch to a Google Cloud instance with more resources.

continue...

Create a Python library

Creating a Python package isn’t something a developer does routinely, so when the time comes, you may end up losing time and patience over small details you forgot. This article shows you how to make your project pip install-able.

continue...

Hashing with Neural Network Weights

This article presents a method for hashing strings based on the weights of an Artificial Neural Network trained on them.

continue...

Extended Game of Life

Invented by Conway in 1970, the Game of Life is a fascinating application of cellular automata with deep implications in computer science and mathematics. Let’s see here how to build a flexible multi-valued and multi-dimensional implementation of such automata with numpy.

continue...

The Fastest Way to Compute the Fibonacci Sequence

What is the Fibonacci sequence? It’s easy to define: the first element is 1, the second is 2, and each following element is the sum of the two previous ones: the 3rd element is 3 (2+1), the 4th is 5 (3+2), the 5th is 8 (5+3) etc.
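The definition translates directly into a short loop. The sketch below is only a plain iterative version using the indexing convention above (1, 2, 3, 5, 8, …); the function name `fibonacci` is chosen for illustration, and this is not claimed to be the fastest method.

```python
def fibonacci(n: int) -> int:
    """Return the n-th element (1-indexed) of the sequence 1, 2, 3, 5, 8, ..."""
    if n < 1:
        raise ValueError("n must be >= 1")
    previous, current = 1, 2  # the 1st and 2nd elements
    for _ in range(n - 1):
        # Slide the window: each new element is the sum of the two previous ones.
        previous, current = current, previous + current
    return previous


print([fibonacci(n) for n in range(1, 6)])  # [1, 2, 3, 5, 8]
```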

continue...

Gradient Descent Explained

Gradient descent is a major technique for training ML/DL models. Let’s take a closer look at it and implement a simple example from scratch in Python that illustrates its main concepts.
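To give a first idea, here is a minimal sketch of the update rule on a one-dimensional function. The function f(x) = (x - 3)², the starting point, the learning rate and the number of steps are all assumptions picked for illustration, not necessarily the example used in the article.

```python
def gradient_descent(gradient, start, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(n_steps):
        x = x - learning_rate * gradient(x)
    return x


def grad_f(x):
    """Gradient of the illustrative function f(x) = (x - 3)**2."""
    return 2.0 * (x - 3.0)


minimum = gradient_descent(grad_f, start=0.0)
print(minimum)  # very close to 3, the minimizer of f
```

With this learning rate, each step shrinks the distance to the minimum by a constant factor (0.8 here), which is why the iterate converges quickly on such a simple function.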

continue...

CAST Model for Covid Prediction

Unless you spent these last few months in a cave at the end of the world (but then you’re probably not reading this article), you couldn’t escape the news about the Covid-19 pandemic. The difficulty of modelling its evolution is striking. In the US for example, a recent report of the CHS warned about this in a review of existing methods. They therefore call for the creation of a national outbreak science centre. Here is an article about CAST, an agent-based micro-simulation model I developed over the last few weeks as an attempt to make predictions about the spread of Corona.

continue...

Create a Progress Bar in Bash

We will show here how to write a bash script implementing a progress bar. Imagine you have a script that automatically creates a batch of images: it takes time, and while it is running you would like to know how far along it is, how fast it is going, and how much longer it needs (this is easily adaptable to other cases, such as a download script). In brief, how to get something like this:

continue...

Cyclical learning rates with Tensorflow Implementation

The learning rate is often considered the most important hyperparameter of a neural network (Bengio 2012). Finding the right one is thus crucial. Even better is to find a good learning rate schedule: modifying the learning rate during training so that the model has a better chance of reaching a better optimum. The goal of this article is to describe a learning rate schedule that seems to work well, along with its Tensorflow implementation and an example with a simple CNN on the MNIST dataset.
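As a rough illustration of what a cyclical schedule can look like, here is a sketch in plain Python of a triangular policy, where the learning rate rises linearly from a lower to an upper bound and back. The function name, the parameter names and their default values are assumptions for illustration, and the schedule described in the article may differ.

```python
import math


def triangular_lr(step, base_lr=0.001, max_lr=0.006, step_size=2000):
    """Learning rate at a given training step for a triangular cycle.

    One full cycle lasts 2 * step_size steps: the rate climbs from base_lr
    to max_lr during the first half and falls back during the second half.
    """
    cycle = math.floor(1 + step / (2 * step_size))
    # x goes 1 -> 0 -> 1 over one cycle (distance from the peak of the cycle).
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


for step in (0, 1000, 2000, 3000, 4000):
    print(step, triangular_lr(step))
```

With these defaults the rate starts at `base_lr`, peaks at `max_lr` at step 2000 and returns to `base_lr` at step 4000, after which the cycle repeats.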

continue...

Transforming Keras Model into Tensorflow Estimator

A Tensorflow Estimator is a convenient object for managing models, especially in production, and Keras is a convenient library for building them. Combining the two is thus a powerful way to leverage their strengths, especially since Keras will be the standard for building models in Tensorflow 2.0. Let’s see how it works:

continue...

What is a computer virus?

  • 15 Mar 2019
  • Manuel Capel
  • Tags: virus

There seems to be some confusion about what a computer virus is, so here is an article that hopefully clarifies it. A computer virus is not necessarily malware, and there are many other types of malware besides viruses. This is best summarized with a small diagram:

continue...

Some Bash Tricks

To open this blog, here are a few bash tricks used to manage it, which may occasionally be useful in many other cases. This blog runs on Jekyll, and I wrote some scripts to create posts and to create or remove the categories (tags) associated with posts.

continue...