A few things

twitter: @JosephErnest
email: here

Articles about:
#music
#photo
#programming

Don't read #tech articles unless you really want to.

Some of my projects:
BigPicture
Jeux d'orgues
SamplerBox
Ojourdui

Low latency audio on a Windows PC with the built-in soundcard (bonus: it's multi-client!)

So you want to use your music production software, with low latency on your PC/Windows laptop?

You have basically two options:

ASIO4ALL has been incredibly useful to the PC music community for more than 10 years, because it turns your cheap computer's built-in soundcard into a low-latency one! With ASIO4ALL, you can plug in a MIDI keyboard and play piano or synth with no "delay". Without it, the delay of more than 50 ms between the keypress and the sound makes it nearly impossible to play.

But ASIO4ALL has one major drawback: it's not multi-client. This means that if your DAW is open with ASIO4ALL as its sound driver and you then open:

... then it won't work: no audio is available to them, because your DAW and ASIO4ALL have locked your soundcard.

This is really annoying, and I can't count how many hours of my life I have wasted over the past 10 years trying to find a solution (every few months or years I tried again and benchmarked every new solution). (OK, switching to Mac would have been a faster solution...)

The real difficulty is that we would like to use

(*) Music software using ASIO, together with a standard application like Firefox using the so-called Windows WDM driver

Here is a list of things I tried, unsuccessfully:

Now, promising solutions:

Here is AsioLinkPro's clever idea: you still use ASIO4ALL as output, but this way (**):

Ableton Live (or any other DAW)    --> ASIO: AsioLinkPro                  \
                                                                            --- AsioLinkPro mixer --> ASIO4ALL
Firefox or Chrome or SoundForge    --> WDM: ASIOVADPRO virtual device     /
     or MP3 player                          (AsioLinkPro)

Clever, because even if there are 2 programs producing sound, AsioLinkPro is the only one which speaks directly with ASIO4ALL (which would not support 2 programs).

It must have been tricky to write, because it requires both a "WDM virtual speaker device" Windows driver and an ASIO driver, phew!

Even if it's discontinued, at least it gives an idea of how to do it. Shall we write such a minimalist open-source tool?

Note: not something very big and complex like Jack, but just a small WDM virtual speaker driver and an ASIO driver that both mix their content and send it to the ASIO4ALL output. (No GUI is even required).

An attempt to generate true entropy and random data with audio (and your computer's built-in microphone)

You probably know that generating truly random data is not so easy with a computer. How to design a good random number generator (or a pseudo-random one) is a math topic you could spend years on; it also matters a lot in real-life applications such as security and cryptography, for example when you need to generate strong passwords.

Usually (and this is true in general in cryptography), designing your own algorithm is a bad idea: unless you're a professional in the field and your algorithm has been reviewed by peers, it will almost certainly have flaws that could be exploited.

But here, for fun (don't use it for critical applications!), let's try to generate 100 MB of true random data.

1) Record 20 minutes of audio in 96 kHz, 16-bit, mono with your computer's built-in microphone. Try to set the mic input level so that the average volume is neither 0 dB (saturation) nor -60 dB (too quiet). Something around -10 dB looks good. What kind of audio should you record? Nothing special, just the noise in your room is OK. You will get 20*60*96000*2 = 230,400,000 bytes, i.e. about 220 MB of data. Of these 220 MB, only about half will be really useful, because many values in the signal (an array of 16-bit integers) won't use the full 16-bit amplitude: many of the integers "encoding" the signal might, for example, have an absolute value < 1024, i.e. provide only 10 bits.

2) Now let's shuffle these millions of bits of data with some Python code:

from scipy.io import wavfile
import numpy as np
import functools

sr, x = wavfile.read('sound.wav')  # read a mono audio file, recorded with your computer's built-in microphone

#### GET A LIST OF ALL THE BITS
L = []  # list of bits
for sample in x:
    bits = format(abs(int(sample)), "b")  # get binary representation of the data; int() avoids the int16 overflow of abs(-32768)
                                   # don't use "016b" format because it would create a bias: small integers (those not using
                                   # the full 16-bit amplitude) would have many leading 0s!
    L += [int(c) for c in bits[1:]]  # discard the first bit, which is always 1 for a nonzero sample

print(L.count(1))
print(L.count(0))  # check if it's equidistributed in 0s and 1s

n = 2 ** int(np.log2(len(L)))
L = L[:n]  # crop the array of bits so that the length is a power of 2; well the only requirement is that len(L) is coprime with p (see below)

### RECREATE A NEW BINARY FILE WITH ALL THESE BITS (SHUFFLED)
# The trick is: don't use **consecutive bits**, as it would recreate something close to the input audio data.
# Let's take one bit every 96263 bits instead! Why 96263? Because it's an odd prime number and n is a power of 2, so we are guaranteed that
# 0 * 96263 mod n, 1 * 96263 mod n, 2 * 96263 mod n, ..., (n-1) * 96263 mod n will cover [0, 1, ..., n-1].  (**)
# This is true since 96263 is coprime with n. In math language: 96263 is a "generator" of (Z/nZ, +).

p = 96263  # The higher this prime number, the better the shuffling of the bits!
           # If you have at least one minute of audio, you have at least 45 million useful bits already,
           # so you could take p = 41716139 (just a random prime number I like around 40M)

M = set()
with open('truerandom', 'wb') as f:
    for i in range(0, n, 8):
        M.update((k * p) % n for k in range(i, i + 8))  # this is optional, here just to prove that our math claim (**) is true
        c = [L[(k * p) % n] for k in range(i, i + 8)]   # take 8 bits, in shuffled order
        byte = functools.reduce(lambda a, b: a * 2 + b, c)  # combine the 8 bits into an integer in 0..255
        f.write(bytes([byte]))  # files opened in 'wb' mode expect bytes, not str

print(M == set(range(n)))  # True, this shows that the assertion (**) before is true. Math rulez!

Done, your truerandom file should be truly random data!
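The claim (**) from the code comments can also be checked on a small scale. The sketch below is only an illustration: stride_order is a hypothetical helper name, n = 1024 is deliberately tiny, and the even stride 96264 is picked here purely as a counter-example. An odd stride is coprime with a power of 2 and visits every index exactly once, while a stride sharing a factor with n keeps coming back to the same indices.

```python
from math import gcd

def stride_order(n, p):
    """Indices visited when stepping through range(n) with stride p, modulo n."""
    return [(k * p) % n for k in range(n)]

n = 2 ** 10          # small power of 2, for illustration only
p = 96263            # the odd stride used in the article

order = stride_order(n, p)
covers_all = sorted(order) == list(range(n))   # True when every index is hit once
print(gcd(p, n), covers_all)

# Counter-example (assumption: 96264 is just an arbitrary even stride)
bad = stride_order(n, 96264)
print(gcd(96264, n), len(set(bad)))            # only n / gcd distinct indices are reached
```

With a stride that is not coprime with n, part of the recorded bits would simply never be used, and others would be used several times.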

Notes:

How to create symbolic links with Windows with a GUI (no command-line)?

Quick tip: here is how to create symlinks in Windows without using the command line tool mklink.

Create symbolic links with a GUI

1) If you have Python installed, create mklinkgui.py:

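import win32clipboard    # pip install pywin32 if you haven't installed it already
import sys, os, subprocess

dest = sys.argv[1]       # folder in which the links will be created
win32clipboard.OpenClipboard()
try:
    filenames = win32clipboard.GetClipboardData(win32clipboard.CF_HDROP)  # paths of the files currently on the clipboard
finally:
    win32clipboard.CloseClipboard()  # always release the clipboard, even if reading it failed
for filename in filenames:
    base = os.path.basename(filename)
    link = os.path.join(dest, base)
    # mklink is a cmd.exe built-in, hence shell=True; /d is needed when the target is a directory
    subprocess.Popen('mklink %s "%s" "%s"' % ('/d' if os.path.isdir(filename) else '', link, filename), shell=True)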

2) Open regedit and

How to use it?

How to find seamless loops in audio files (with a little bit of math and programming)

When making instrument sample sets (e.g. church organ sample sets used with Hauptwerk or GrandOrgue, see my project Jeux d'orgues), we need to set looping points in WAV audio files:

such that when playing the part [a, b] in a loop, we don't hear any click or pop when the sample reaches the end of the loop.

Example 1: bad loop with audible clicks

Example 2: seamless loop with no click, that's what we are looking for! The loop has a ~ 2.670 second period; can you hear where the looping points are?

 

Finding looping points can be done manually but this is a very long and tedious task. A few programs exist to do this process automatically such as Extreme Sample Converter (it has an excellent auto-looping algorithm), LoopAuditioneer (open source), Zero-X Seamless Looper, SampleLooper, etc.

Here we'll look at a home-cooked algorithm that works well to detect looping points.

First of all, let's load the audio file (downloadable here) with Python:

from scipy.io import wavfile
import numpy as np
import itertools

sr, x = wavfile.read('060.wav')
x0 = x if x.ndim == 1 else x[:, 0]     # let's keep only 1 channel for simplicity, but we could easily generalize this for 2 channels
x0 = np.asarray(x0, dtype=np.float32)

Let's say the audio file's sustain part (this is precisely where we're looking for a loop!) begins at t=2 sec and finishes at t=9 sec. We will now subdivide the time-interval [2 sec, 9 sec] into a 250 milliseconds grid: 2, 2.25, 2.5, 2.75, 3, 3.25, ..., 8.75, 9.

From this sequence, we now create "loop candidates" (a, b) of length at least 1 second, example: (2.5, 7.5), (3.25, 5.75), (6.0, 8.75), etc. Then, for each loop candidate, we'll improve the loop (this is the core of the algorithm, it will be discussed in the next paragraph) and compute a distance d. We finally keep the loop that has the minimal distance (among all loop candidates). Finished!

A = [int((2 + 0.25 * k) * sr) for k in range(29)]  # the grid 2, 2.25, 2.5, ... 8.75, 9
dist = np.inf
for a, b in itertools.product(A, A):  # cartesian product: pairs (a, b) of points on the grid
    if b - a < 1 * sr:  # skip candidates shorter than 1 second
        continue
    a, B, d = improveloop(x0, a, b, sr=sr)
    print('Loop (%.3fs, %.3fs) improved to (%.3fs, %.3fs), distance: %i' % (a / sr, b / sr, a / sr, B / sr, d))

    if d < dist:  # keep the best loop found so far
        aa = a
        BB = B
        dist = d

print("The final loop is (%.3fs, %.3fs), i.e. (%i, %i)." % (aa / sr, BB / sr, aa, BB))

Finished? Not yet! We need to explain what we mean by improving a loop, as that's the crucial part of the algorithm. More precisely, we'll now explain how to transform a loop (3.25, 5.75) with points taken on the grid (this random loop probably "clicks" like in Example 1 before!) into a "good loop" (3.25, 5.831). Let's zoom on the junction point to understand what's going on:

How to measure if a loop is good or not? Ideally, if the loop (a, b) is perfect/seamless, x[a:a+10 ms] should be very close to x[b:b+10 ms]. Measuring how close two arrays x and y are can be done by computing sum((x[n]-y[n])^2), and if the sum is small, x and y are close.

Finding k such that np.sum(np.abs(x0[a:a+W1]-x0[k+b:k+b+W1])**2) is minimal can be obtained by noting that

(x[n] - y[n+k])**2  = x[n]**2 - 2*x[n]*y[n+k] + y[n+k]**2

and by using numpy.correlate. We can now define this function:

def improveloop(x0, a, b, sr=44100, w1=0.010, w2=0.100):
    """
    Input:  (a, b) is a loop
    Output: (a, B) is a better loop 
            distance (the less the distance the better the loop)
    This function moves the loop's endpoint b to B (up to w2 - w1 = 90 ms further with the default values) such that (a, B) is a "better" loop, i.e. sum((x0[a:a+10ms] - x0[B:B+10ms])^2) is minimal
    """

    W1 = int(w1*sr)
    W2 = int(w2*sr)
    x = x0[a:a+W1]
    y = x0[b:b+W2]
    delta = np.sum(x**2) - 2*np.correlate(y, x) + np.correlate(y**2, np.ones_like(x))
    K = np.argmin(delta)
    B = K + b
    distance = delta[K]

    return a, B, distance

That's it, in fewer than 50 lines of Python code!
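To convince yourself that the numpy.correlate expression really computes the sum of squared differences for every offset k, here is a small self-check on synthetic data. It is only a sketch: the helper names delta_bruteforce and delta_correlate are made up for this example, and the random signal (with a copy of x planted at offset 123) stands in for real audio.

```python
import numpy as np

def delta_bruteforce(x, y):
    """sum((x[n] - y[n+k])**2) for every offset k, computed with an explicit loop."""
    w = len(x)
    return np.array([np.sum((x - y[k:k + w]) ** 2) for k in range(len(y) - w + 1)])

def delta_correlate(x, y):
    """Same quantity, using x**2 - 2*x*y + y**2 and np.correlate (default 'valid' mode)."""
    return np.sum(x ** 2) - 2 * np.correlate(y, x) + np.correlate(y ** 2, np.ones_like(x))

rng = np.random.default_rng(0)      # synthetic data, for illustration only
x = rng.standard_normal(50)         # plays the role of x0[a:a+W1]
y = rng.standard_normal(400)        # plays the role of x0[b:b+W2]
y[123:173] = x                      # plant an exact copy of x at offset 123

slow = delta_bruteforce(x, y)
fast = delta_correlate(x, y)
best = int(np.argmin(fast))         # offset where the two windows match best
print(np.allclose(slow, fast), best)
```

Both versions return one distance per offset; the correlate version just avoids the Python-level loop, which matters when it is called for hundreds of loop candidates.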

This audio file

(looped 4 times here but we could loop it forever) has been obtained with the algorithm described here. Not too bad, n'est-ce pas?


Example of output:

Loop (2.000s, 3.000s) improved to (2.000s, 3.009s), distance: 1003724800
Loop (2.000s, 3.250s) improved to (2.000s, 3.340s), distance: 839278592
Loop (2.000s, 3.500s) improved to (2.000s, 3.559s), distance: 1281863680
[...]
Loop (2.000s, 8.500s) improved to (2.000s, 8.544s), distance: 1092337664
Loop (2.000s, 8.750s) improved to (2.000s, 8.789s), distance: 964747264
Loop (2.000s, 9.000s) improved to (2.000s, 9.004s), distance: 2488913920
[...]
Loop (7.750s, 9.000s) improved to (7.750s, 9.004s), distance: 1167093760
Loop (8.000s, 9.000s) improved to (8.000s, 9.001s), distance: 1710333952

The final loop is (6.750s, 8.322s), i.e. (297675, 366989).

Note: Wouldn't it be possible to save these loop markers inside the WAV file's metadata instead of just printing them on screen? Sure it is, but as Python's standard library doesn't support WAV markers editing, you'll have to use these techniques to do this.

Let's be done with prefaces (let's postface them!)

Figure 1: DIY-postfaced version of Rousseau's Confessions.

 

By what presumption, yes, presumption, can the publisher or the "prefacer" place their own words (some twenty pages of explanation, often useful in hindsight once the main text has been read, but hardly enlightening beforehand) at the start of the book, before the author's own words?

By what mystery has it become the norm, for paperbacks, to have to leaf through to page 35 before finally reading the first lines of the author's original text?

To illustrate my point, I do not see why the opening lines of Rousseau's Confessions:

Here is the only portrait of a man, painted exactly from nature and in all its truth, that exists and probably ever will exist.
[...]
I am undertaking a project that has never had a precedent and whose execution will have no imitator. I want to show my fellow men a man in all the truth of nature; and that man will be me.

would not stand on their own and would need to be preceded by this preface:

This book, for its author as for us, is first of all an act: confessions, not memoirs, even if the rhythm of the narrative rests on a chronological framework; an appeal to the other, a seductive and pathetic appeal, which arouses in the reader by turns complicit intimacy and irritated distance, not a search for lost time; an apology and not an assessment; ...

and some twenty more pages in the same vein?

Apart from rare cases (notably a preface chosen in agreement with the author at the time of publication, which is clearly not the case here for Rousseau), I do not think I have ever read a preface to a classic work that really makes sense before reading the author's text. Almost all prefaces to classic texts provide additional historical insight and context that are interesting after reading the body of the text, but why place this information before it? In the case illustrated above, this preface by J. B. Pontalis would certainly have its place in this book, but as a postface.

Pushing the reasoning further, one could almost conclude that the publisher considers that the author did not give enough context for the text to stand on its own, and that this must therefore be "corrected" with preliminary explanations before the text can be grasped. Yet it was the author's choice to begin the text the way he did; why assume it must be explained beforehand, as if his words were insufficient?

Prefaces are also an aberration from a teaching point of view: we all met people in high school (not so rare) who boasted of not having read the book we were supposed to be studying:

"Did you read it?"
"Nope, I didn't!"

For someone not really into reading, managing to find the start of the author's original text is no small feat. I am willing to bet that many middle-school pupils have missed out on a Molière play because they got lost or fell asleep in the first fifteen pages of ramblings by some inspector general (without knowing that what they were reading was not actually Molière). I am also willing to bet that the share of pupils in a 3ème class (roughly ninth grade) who read a Molière play would be much higher if the play started on page 3 rather than page 33!

What better way to make an author's text shine than to let it appear in black and white on the first or second page after opening the book?

For pity's sake, dear publishers, postface your prefaces.

Un Canapé à Orléans (A Sofa in Orléans) - A random photographic experiment

One day in 2013, coming back from the university by tram, I passed an old abandoned sofa at the corner of rue d'Alibert. I was in a resolutely "photographic" period: as a newcomer to Orléans, what better than street photography to discover a city and its inhabitants? So I decided to fetch my camera, and ten minutes later there I was, waiting for passers-by and suggesting that they strike a pose (or take a pause?) on this sofa.

To my great surprise, most people were willing to play along. The question "But what for?" got the answer "Just for a photo, and possibly for an exhibition", which did happen the following year at the festival Les Ingrédients and at the bar L'Escargot.

As the days went by, I posted the new shots on the page Un Canapé A Orléans, and new people turned up: either simple passers-by or people who had seen the page online.

Thanks to the people of the city's road maintenance department who agreed to delay the removal of the sofa by a few days, to La République du Centre for their articles, to the Publicités sur canapé piece in the magazine Arrêt sur Images, and of course to everyone who was photographed!

“Marianna & Joseph - Run, Hayley, Run”

Here is a song written with Marianna Kosch, recorded in Paris in 2017. 80s synthpop vibes!

Many thanks to Hayley Connaughton for the artwork.

Working with audio files in Python, advanced use cases (24-bit, 32-bit, cue and loop markers, etc.)

Python comes with the built-in wave module and for most use cases, it's enough to read and write .wav audio files.
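For reference, here is roughly what the "most use cases" scenario looks like with the standard library only. This is a minimal sketch, not taken from any of the projects mentioned here: the file name wave_demo.wav and the 440 Hz test tone are arbitrary choices for the example. It writes one second of 16-bit mono audio and reads it back.

```python
import math
import os
import struct
import tempfile
import wave

sr = 44100
# One second of a 440 Hz sine at half of the 16-bit full scale
samples = [int(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / sr)) for n in range(sr)]

path = os.path.join(tempfile.gettempdir(), 'wave_demo.wav')  # hypothetical file name

with wave.open(path, 'wb') as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 2 bytes per sample = 16-bit
    w.setframerate(sr)
    w.writeframes(struct.pack('<%dh' % len(samples), *samples))  # little-endian signed 16-bit

with wave.open(path, 'rb') as r:
    nchannels, sampwidth, framerate, nframes = r.getparams()[:4]
    raw = r.readframes(nframes)

decoded = struct.unpack('<%dh' % nframes, raw)  # back to a tuple of Python ints
print(nchannels, sampwidth, framerate, nframes)
```

Note that the samples have to be packed and unpacked by hand with struct, which is manageable for 16-bit data but gets awkward for other sample formats.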

But in some cases, you need to be able to work with 24 or 32-bit audio files, to read cue markers, loop markers or other metadata (required for example when designing a sampler software). As I needed this for various projects such as SamplerBox, here are some contributions I made:

  1. The Python standard library's wave module doesn't read cue markers and doesn't support 24-bit files. Here is an updated module:

    wave.py (enhanced)

    that adds some little useful things. (See Revision #1 to see diff with the original stdlib code).

    Usage example:

    from wave import open
    
    f = open('Take1.wav')
    print(f.getmarkers())

    If you're familiar with contributing to the main Python repository (I'm not), feel free to include these additions there.

  2. The module scipy.io.wavfile is very useful too. So here is an enhanced version:

    wavfile.py (enhanced)

    Among other things, it adds 24-bit and 32-bit IEEE support, cue marker & cue marker labels support, pitch metadata, etc.

    Usage example:

    from wavfile import read, write
    
    (sr, samples, br, cue, cuelabels, cuelist, loops, f0) = read('Take1.wav', readmarkers=True, readmarkerlabels=True, readmarkerslist=True, readpitch=True, readloops=True)
    print(read('Take1.wav', readmarkers=True, readmarkerlabels=True, readmarkerslist=True, readpitch=True, readloops=True))
    
    write('Take2.wav', sr, samples, bitrate=br, markers=cue, loops=loops, pitch=130.82)
    print(read('Take2.wav', readmarkers=True, readmarkerlabels=True, readmarkerslist=True, readpitch=True, readloops=True))
    
    write('Take3.wav', sr, samples, bitrate=br, markers=cuelist, loops=loops, pitch=130.82)

    Here is a Github post and pull-request about a hypothetical merge to Scipy.

Here is what loop markers look like in the good old (non open-source but soooo useful) SoundForge:


Lastly, this is how to convert a WAV to MP3 with pydub, for future reference. As usual, do pip install pydub and make sure ffmpeg is in the system path. Then:

from pydub import AudioSegment
song = AudioSegment.from_wav("test.wav")
song.export("test.mp3", format="mp3", bitrate="256k")

will convert a WAV file to MP3.

Make a zooming + panning user interface work on mobile devices (in progress)

What's cool with Zooming User Interfaces is that you always have free space available anywhere (either by zooming or panning) to write new ideas.

That was the key idea in 2014 when creating BigPicture (ready-to-use infinite notepad in-the-cloud) and the open-source JavaScript library bigpicture.js powering it:

It works as expected on desktop browsers. Now, the next big challenge is: how to make it work on mobile devices?

It's funny to even have to ask this question, since touch devices are natively made to do panning (slide finger on screen) and zooming (pinch with 2 fingers). So it should be straightforward to adapt BigPicture to mobile devices.

However here are the difficulties:

  1. The transform/scale from CSS has limitations (probably max 10x or 100x factor when I started this project a few years ago), so we can't only use this to do a (nearly) infinite zooming user interface

  2. We need to be able to zoom on a particular part of the viewport without zooming the other parts of the HTML document (e.g. a top navigation header). Here are many potential solutions:

  3. Possible useful tools for this:

    • Zoomooz (however, I read in comments: Zoomooz does not support multi-touch pinching events. Its only a library for zooming into elements on a page, but has no support for pinching behavior, so far as I can see in the documentation.)

    • Hammer.js

    • ZUI53

    • TouchSwipe, a jQuery plugin for touch devices
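Whatever library ends up handling the touch events, the coordinate math behind "zoom here, keep the rest of the page untouched" is small. The sketch below is written in Python only to stay consistent with the other examples on this blog, and it is not the bigpicture.js implementation: View, zoom_about and pan are hypothetical names. The idea is to track the pan offset and zoom level ourselves (instead of relying on a single CSS scale), and to keep the world point under the pinch centre fixed while zooming.

```python
from dataclasses import dataclass

@dataclass
class View:
    x: float      # world coordinate shown at the left edge of the viewport
    y: float      # world coordinate shown at the top edge of the viewport
    zoom: float   # screen pixels per world unit

    def to_screen(self, wx, wy):
        return ((wx - self.x) * self.zoom, (wy - self.y) * self.zoom)

    def to_world(self, sx, sy):
        return (self.x + sx / self.zoom, self.y + sy / self.zoom)

def zoom_about(view, sx, sy, factor):
    """Return a new View zoomed by `factor`, keeping the world point under (sx, sy) fixed."""
    wx, wy = view.to_world(sx, sy)
    new_zoom = view.zoom * factor
    return View(wx - sx / new_zoom, wy - sy / new_zoom, new_zoom)

def pan(view, dx, dy):
    """Return a new View after dragging the content by (dx, dy) screen pixels."""
    return View(view.x - dx / view.zoom, view.y - dy / view.zoom, view.zoom)

view = View(x=0.0, y=0.0, zoom=1.0)
focus = (300.0, 200.0)                       # e.g. the midpoint between two fingers
anchor = view.to_world(*focus)               # world point currently under the fingers
zoomed = zoom_about(view, *focus, factor=2.0)
print(zoomed, zoomed.to_screen(*anchor))     # the anchor is still under the fingers
```

Because only the elements inside the zoomable area are positioned from this view state, a fixed header can simply ignore it.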

Work in progress!

By the way, here is how to simulate touch events on Chrome for desktop computer: open the Developer console (F12), then there's a top-left button "Toggle device toolbar" (CTRL+SHIFT+M), here you go! For pinch-zoom events, use SHIFT + click + mouse up.

Your tests / pull requests / help to build a mobile version are welcome on this branch: https://github.com/josephernest/bigpicture.js/tree/mobile!

If you really like that open-source project, you can donate here: 1NkhiexP8NgKadN7yPrKg26Y4DN4hTsXbz.

Writing, a text-editor in the browser

Since I've started using StackOverflow, I've always loved their text editor (the one you use when writing a question/answer), because it supports Markdown syntax (a very elegant markup language to add bold, italic, titles, links, itemization, etc.), and even MathJax (which is more or less LaTeX syntax in the browser). I've always wanted to use such an editor for my own documents.

After some research, I found a few existing tools, but:

Let's go and actually build one! Here is the result, Writing:

Here's the source: https://github.com/josephernest/writing

For sure you'll like it!

If you really like that, you can donate here: 1NkhiexP8NgKadN7yPrKg26Y4DN4hTsXbz
