Sandboxing data crunches, Chapter 2: clone processes

It costs 1–2 seconds between when Renderer calls subprocess.Popen() and when Step sandboxes itself. Importing Pandas, Numpy and Pyarrow takes too long.
clone() duplicates a process; execve() wipes its memory and makes it run something else
  1. clone() — duplicate all the RAM of the current process. Now there are two processes: the Renderer process and its near-exact clone, the Step process. The Renderer process continues; in it, clone()returns the Step process ID (“pid”). At the same time, the Step process continues at the exact same place; but there, clone() returns 0. Aside from that single integer difference, the two processes are identical.
  2. The process that judges itself to be a child (because clone() returned 0) calls execve("/path/to/program", …). This erases all the unwanted — and sensitive! — Renderer data from memory and starts the Step program.
Fast but unacceptable
This is essentially Python’s “multiprocessing.forkserver” design
  1. Renderer signals Spawner to invoke clone().
  2. Spawner clones itself to create a Step process, costing less than 1 millisecond. (Since Step is a clone of Spawner, its memory doesn’t contain secrets or user data.)
  3. The Step process stops behaving like the Spawner it cloned; it begins sandboxing instead. See — this Step process started instantly!
  4. Meanwhile, Spawner returns a subprocess handle to Renderer and then awaits Renderer’s next request. Spawner can do this all day….
clone(CLONE_PARENT) makes Step is a child of Renderer
Calling clone() from Python

--

--

--

Journalist, ex software engineer

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

List of hackathon tips. How we got $7250 bounties on ETHDenver.

Weighted Quick Union algorithm explained

The world wide web is like a bow tie: Discovering graph structure with Neo4j

Storage Classes in C

Automating automation: updating multiple Azure DevOps pipelines using Powershell scripting

AWS: Amazon Aurora 2nd Part

Easy way to add Custom Post Type and Custom Fields in Wordpress

Migrating Big Data in Moving Trains with AWS

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Adam Hooper

Adam Hooper

Journalist, ex software engineer

More from Medium

LLDB Custom Data Formatters for C in Python

[Solved] Pytest Error: ImportError: Error importing plugin ‘’: No module named …

Graph Data Structure — Theory and Python Implementation

Reversing part of a string in Python