Sunday, February 18, 2018

Projects and features Meson could use help with

A question I was asked during my LCA2018 presentation was how people could help the Meson project. I could not come up with proper projects off the cuff, so here are a bunch of things that have come up since. Feel free to contact us via IRC, email or any other medium if you wish to contribute.

WrapDB wrangler

WrapDB provides a simple way to download source dependencies automatically. Basically it takes an upstream release tarball, adds Meson build files to it if needed and publishes the result on the web. The work consists mostly of reviewing and merging submissions from the community. Creating your own is also fine. This is a fairly lightweight task, only requiring actions every now and then (submissions come less than once a week, typically).

CI fixer upper

For CI we use the free tiers of Travis and AppVeyor. This works fairly well, but it is very slow because our testing matrix is huge. Running the full test suite through AppVeyor takes about an hour. This slows us down a fair bit, and in addition both CI providers have a nasty habit of breaking down fairly often. We also don't want to move to paid tiers because they get ridiculously expensive for our usage pattern (as in, a few months of paid macOS builds would cost more than a brand new Mac Mini).

We don't have any good ideas on how to make this better. If you do, let us know.

Large scale regression tester

Meson is being used by a fairly large number of projects. This makes fixing bugs and refactoring code challenging because there is always the possibility of regressions. It would be nice if we could do something similar to what the Rust developers do and rebuild all, or at least a large fraction of, the projects using Meson with the trunk version every now and then.
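
A very rough sketch of what such a rebuilder could look like, assuming nothing more than a plain list of Git URLs for projects that build with Meson (the example URLs, directory names and build commands below are purely illustrative, not an actual project list):

#!/usr/bin/env python3
# Rough sketch: rebuild a list of Meson-using projects with the Meson
# currently on PATH and report which ones fail. Example entries only.
import pathlib
import shutil
import subprocess
import sys

PROJECTS = [
    'https://gitlab.gnome.org/GNOME/glib.git',
    'https://gitlab.freedesktop.org/gstreamer/gstreamer.git',
]

def build(url, workdir):
    srcdir = workdir / pathlib.Path(url).stem
    if not srcdir.exists():
        subprocess.check_call(['git', 'clone', '--depth=1', url, str(srcdir)])
    builddir = srcdir / '_build'
    if builddir.exists():
        shutil.rmtree(builddir)
    try:
        subprocess.check_call(['meson', '_build'], cwd=srcdir)
        subprocess.check_call(['ninja', '-C', '_build'], cwd=srcdir)
        return True
    except subprocess.CalledProcessError:
        return False

if __name__ == '__main__':
    workdir = pathlib.Path('regression-workdir')
    workdir.mkdir(exist_ok=True)
    failures = [url for url in PROJECTS if not build(url, workdir)]
    print('Failed projects:', ', '.join(failures) if failures else 'none')
    sys.exit(1 if failures else 0)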

Xcode backend improvements

The Xcode backend is currently a bit crappy. The main reason for this is that the Xcode project file format is awful in many ways, the two biggest being that it is completely undocumented and that it is not really a file format as such, but more of a memory dump of Xcode's internal data structures. But if you are the sort of person who enjoys the challenge of battling windmills, this might be for you.

Meson build file rewriter

Integration with IDEs and the like is important, and we want to provide tools for operations such as "add source file X to target Y" so that everyone does not have to write their own implementation. There is actually code for this in trunk, but it is quite limited and has bitrotted a fair bit. Resurrecting it and making it actually work would be very welcome.
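
To make the idea concrete, here is a toy sketch of what "add source file X to target Y" means mechanically. It hacks on the raw text with a regular expression, whereas the actual code in trunk works on the parsed build definition, so treat this purely as an illustration of the operation, not of the real implementation:

#!/usr/bin/env python3
# Toy illustration of "add source file X to target Y" in a meson.build file.
# Splices the new file name right after the target name, which is valid
# Meson syntax. Not robust; only meant to show the shape of the operation.
import re
import sys

def add_source(build_text, target_name, new_source):
    pattern = re.compile(r"((?:executable|library)\(\s*'" + re.escape(target_name) + r"')")
    new_text, n = pattern.subn(r"\1, '" + new_source + r"'", build_text, count=1)
    if n == 0:
        raise ValueError(f'Target {target_name} not found.')
    return new_text

if __name__ == '__main__':
    buildfile, target, source = sys.argv[1:4]
    with open(buildfile) as f:
        text = f.read()
    with open(buildfile, 'w') as f:
        f.write(add_source(text, target, source))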

Introspect improvements

This one also aims to improve the IDE integration features of Meson. As an example, currently you can only get information about build targets one at a time. This means that getting the information for a project with thousands of targets takes forever. We really need a batch exporter so that IDEs can grab all the necessary project information in one go. There are probably a bunch of other things to improve as well.

Could these be done as part of GSoC/Outreachy/other?

Possibly. Meson is not really an "entity" in the GSoC sense, but we could potentially get something accepted under the GNOME umbrella. However, anyone is welcome to submit patches, obviously, and several of the topics listed above are not nicely self-contained projects that would fit the GSoC mold at all.

Friday, February 16, 2018

Automatically finding slow headers in C++ projects

A common problem in older C++ codebases is that sources compile slowly due to massive header includes. Headers include other headers, which include even more headers and then, somewhere in the guts of the system, someone includes a header that is very slow to parse. Now things are slow and nobody really knows why.

Trawling through the header soup manually is not feasible. Even if you were to inspect the headers by hand, it is difficult to know which ones are slow. Educated guesses can be made, such as assuming that anything with the word "boost" in its name is slow, but that only gets you so far. Fortunately it turns out to be fairly straightforward to write a tool that finds the slow ones automatically.

We need two things to be able to reliably measure the inclusion time breakdown of the headers of any source file.

  1. The transitive list of all header files it includes.
  2. The exact compiler flags used to compile the source.
The former can be obtained from a dependency file that the compiler can be told to generate during compilation (and which almost all modern build systems use by default). The latter can be obtained from the compilation command database (compile_commands.json), which most build tools can also generate these days. The actual algorithm is simple: for each dependency header, create a dummy cpp file that just #includes that header, compile that file with the original flags and measure how long it takes.
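
Here is a condensed sketch of that algorithm in Python. It assumes a GCC/Clang style compiler invoked as c++, the usual compile_commands.json layout and a dependency file passed on the command line, and it skips the corner cases a real script needs to handle:

#!/usr/bin/env python3
# Sketch: measure how long each header of a source file takes to compile
# on its own. Assumes compile_commands.json and a GCC/Clang style compiler.
import json
import os
import subprocess
import sys
import tempfile
import time

def compile_args(source, dbfile='compile_commands.json'):
    for entry in json.load(open(dbfile)):
        if entry['file'].endswith(source):
            args = entry['command'].split()[1:]
            # Keep the flags, drop the input and output file arguments.
            flags = [a for a in args
                     if a not in ('-c', '-o') and not a.endswith(('.o', '.cpp', '.cc'))]
            return flags, entry['directory']
    sys.exit(f'{source} not found in {dbfile}')

def headers_from_depfile(depfile):
    # A .d file is "target: dep1 dep2 \" continued over several lines.
    deps = open(depfile).read().replace('\\\n', ' ').split(':', 1)[1].split()
    return [d for d in deps if d.endswith(('.h', '.hpp')) or '/include/' in d]

def time_header(header, flags, workdir):
    with tempfile.NamedTemporaryFile('w', suffix='.cpp', dir=workdir, delete=False) as f:
        f.write(f'#include "{header}"\n')
        dummy = f.name
    start = time.time()
    subprocess.check_call(['c++'] + flags + ['-c', dummy, '-o', os.devnull], cwd=workdir)
    os.unlink(dummy)
    return time.time() - start

if __name__ == '__main__':
    source, depfile = sys.argv[1], sys.argv[2]
    flags, workdir = compile_args(source)
    timings = sorted(((time_header(h, flags, workdir), h)
                      for h in headers_from_depfile(depfile)), reverse=True)
    for t, h in timings:
        print(f'{t:.4f} {h}')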

I created a repo with the measurement script and a sample project to test it on. It has one source file and a few internal headers that include external headers. Here's the top part of its output:

0.5875 ../h1.h
0.5254 /usr/include/c++/7/regex
0.2779 /usr/include/c++/7/shared_mutex
0.2747 /usr/include/c++/7/condition_variable
0.2685 ../h2.h
0.2563 /usr/include/c++/7/locale
0.2445 /usr/include/c++/7/sstream
0.2337 ../h3.h
0.2330 /usr/include/c++/7/iostream
0.2329 /usr/include/c++/7/istream

Iostream has traditionally been considered big, bloated and slow to compile. However, in this simple example we find that shared_mutex is even slower.

There are, of course, many caveats with this method. The main one being that this does not measure the code generation time, only parsing time. These two are usually highly correlated, though.

Wednesday, February 14, 2018

Meson's dependency manager in action building GTK

One of the greatest things about creating software is seeing other people pick it up and run with it. Here is a great example of GTK's new development experience using Meson subprojects to automatically obtain dependencies.

It is easy to see how this makes it easier for newcomers to participate. There are no longer pages upon pages of instructions on how to set up a build environment and so on. All that is required is to clone one Git repo and start building. The build system will take care of all the rest.

The eventual goal is to be able to build the entire stack fully from scratch on any platform, even Windows with the Visual Studio compiler. Unfortunately there are still a few missing features but we'll get them added at some point.

Friday, February 9, 2018

Looking inside a Linux powered slot machine

In my day job I work as a consultant. This means that I get to see all kinds of interesting things. One of them is this piece of hardware here:


This is a slot machine operated by Veikkaus, the state-run corporation that runs all gambling services in Finland. There are roughly 20 000 slot machines in use in Finland currently. This is interesting on its own, but things get really fun when you look inside.


A fair fraction of the insides is taken by machinery that deals with coins. When a coin is inserted in the machine it first goes in the coin acceptor, which is marked with a green box in the image. It detects the type of the coin. Each denomination has its own exit chute. Bad coins are rejected from the machine while sorted coins get passed into coin hoppers (marked in red).

A coin hopper is basically a bowl of coins and a mechanism that is capable of ejecting coins from it one by one. When you think of slot machines, you are probably thinking of the sound they make when they start spitting out tons of coins after a jackpot. Coin hoppers are what create that particular sound. I recommend looking up videos on YouTube if you are interested in mechanical engineering, because the way they work is kind of fascinating.

The slot machine also accepts notes and debit card payments but these are mechanically much simpler and don't take much space. The only thing remaining in the picture is the box marked in yellow. It contains the actual brains of the entire machine.

The contents of the brain

The main system is, much like everything these days, a regular computer. This specific one is a fairly average industrial PC running a custom version of Debian. At boot it starts up the game software, which is based on a custom version of the Ogre 3D graphics engine. The computer also manages and controls all the other hardware in the cabinet, such as the coin hoppers and note acceptor mentioned above, using a custom, self-designed controller board. The cabinet housing the device is also custom designed and built.

Thus, surprisingly, at its core a slot machine is roughly the same as a desktop PC running desktop games with a few extra peripherals. This means that Linux desktop gaming has been mainstream among the general Finnish population for 15 years, which is roughly how long these slot machines have been deployed.

In addition to the games themselves, the development environment is also 100% Linux. As a demonstration, here is a screen shot of a development version of the software running on a developer workstation.

What about the money?

Like all forms of gambling, slot machines make quite a lot of money. The profits, at last count, were on the order of 500 million euros per year. As Veikkaus is a government-run business, this money is given out to various charitable organisations as well as to the state. Given that Finland's yearly budget is on the order of 50 billion euros, this means that profits from Linux desktop gaming account for almost 1% of the entire budget of the state of Finland.

Acknowledgements

Thanks to Veikkaus for giving me permission to write this blog post. Extra special thanks for allowing me to show the picture of the insides of a slot machine, which has never before been shown in public.

Sunday, December 31, 2017

These three things could improve the Linux development experience dramatically, #2 will surprise you

The development experience on a modern Linux system is fairly good, but there are several strange things, mostly legacy behaviours that are no longer relevant, that cause weird bugs, hassles and other problems. Here are three suggestions for improvement:

1. Get rid of global state

There is a surprisingly large amount of global (mutable) state everywhere. There are also many places where said global state is altered in secret. As an example let's look at pkg-config files. If you have installed some package in a temporary location and request its linker flags with pkg-config --libs foo, you get out something like this:

-L/opt/lib -lfoo

The semantic meaning of these flags is "link against libfoo.so that is in /opt/lib". But that is not what these flags do. What they actually mean is "add /opt/lib to the global link library search path, then search for foo in all search paths". This has two problems. First of all, the linker might, or might not, use the library file in /opt/lib. Depending on other linker flags, it might find it somewhere else. But the bigger problem is that the -L option remains in effect after this. Any library search later might pick up libraries in /opt/lib that it should not have. Most of the time things work. Every now and then they break. This is what happens when you fiddle with global state.

The fix to this is fairly simple and requires only changing the pkg-config file generator so it outputs the following for --libs foo:

/opt/lib/libfoo.so
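
To illustrate the difference, here is a small Python sketch that post-processes the current pkg-config output and resolves each -L/-l pair into a full path. The fallback search directories are made up for the example, and it ignores static libraries and all the other flag types a real .pc file can contain:

#!/usr/bin/env python3
# Sketch: turn "pkg-config --libs foo" output such as "-L/opt/lib -lfoo"
# into explicit file names such as "/opt/lib/libfoo.so".
import os
import subprocess
import sys

def resolve_libs(package):
    out = subprocess.check_output(['pkg-config', '--libs', package], text=True).split()
    # Real linkers have a longer default search path (multiarch dirs etc).
    searchdirs = [a[2:] for a in out if a.startswith('-L')] + ['/usr/lib', '/usr/local/lib']
    resolved = []
    for arg in out:
        if not arg.startswith('-l'):
            continue
        name = 'lib' + arg[2:] + '.so'
        for d in searchdirs:
            candidate = os.path.join(d, name)
            if os.path.exists(candidate):
                resolved.append(candidate)
                break
        else:
            resolved.append(arg)  # Could not resolve, keep the original flag.
    return resolved

if __name__ == '__main__':
    print(' '.join(resolve_libs(sys.argv[1])))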

2. Get rid of -lm, -pthread et al

Back when C was first created, libc had very little functionality in it. Because of reasons, when new functionality was added it went into its own library that you could then enable with a linker flag. Examples include -lm to add the math library and -ldl to get dlopen and friends. Similarly, when threads appeared each compiler had its own way of enabling them, and eventually any compiler not using -pthread died out.

If you look at the compiler flags in most projects there are a ton of gymnastics for adding all these flags not only to compiler flags but also to things like .pc files. And then there is code to take these flags out again when e.g. compiling on Visual Studio. And don't even get me started on related things like ltdl.

All of this is just pointless busywork. There is no reason all of these could not be in libc proper, always available and always used. It is unlikely that math libraries or threads are going to go away any time soon. In fact this has already been done by pretty much every libc that is not glibc: VS has these by default, as do macOS, the BSDs and even the alternative Linux libcs. The good news is that the glibc maintainers are already in the process of making this transition. Soon all of this pointless flag juggling will go away.

3. Get rid of 70s memory optimizations

Let's assume you are building an executable and that your project has two internal helper libraries. First you do this:

gcc -o myexe myexe.o lib1.a lib2.a

This gives you a linker error due to lib2 missing some symbols that are in lib1. To fix this you try:

gcc -o myexe myexe.o lib2.a lib1.a

But now you get missing symbols in lib1. The helper libraries have a circular dependency so you need to do this:

gcc -o myexe myexe.o lib1.a lib2.a lib1.a

Yes, you do need to specify lib1 twice. The reason for this lies in the fact that in the 70s memory was limited. The linker goes through the libraries one by one. When it processes a static library, it copies in the symbols that are currently listed as missing and then throws away the rest. Thus if lib2 requires any symbol that myexe.o did not refer to, tough luck, those symbols are gone. The only way to access them is to add lib1 to the linker line again and have it processed in full a second time.

This simple issue can be fixed by hand, but things get more complicated if the libraries come from external dependencies. The correct fix would be to change the linker to behave roughly like this:
  • Go through the entire linker line and find all libraries.
  • Look at which ones point to the same physical files and deduplicate them.
  • Wrap all of them in a single -Wl,--start-group -Wl,--end-group block.
  • Do symbol lookup once, in a global context.
This is a fair bit of work and may cause some breakage. On the other hand we do know that this works because many linkers already do this, for example Visual Studio and LLVM's new lld linker.
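
In the meantime you can get much the same effect by hand with GNU ld by wrapping the archives in a group, for example gcc -o myexe myexe.o -Wl,--start-group lib1.a lib2.a -Wl,--end-group, which makes the linker search the group repeatedly until no new symbols can be resolved.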

Tuesday, December 26, 2017

Creating a USB image that boots to a single GUI app from scratch

Every now and then you might want or need to create a custom Linux install that boots from a USB stick, starts a single GUI application and keeps running it until the user turns off the power. As an example, at a former workplace I created an application for downloading firmware images from an internal server and flashing them. The idea was that even non-technical people could walk up to the computer, plug in their device via USB and push a button to get it flashed.

Creating your own image based on the latest stable Debian turns out to be relatively straightforward, though there are a few pitfalls. The steps are roughly the following:
  1. Create a Debian bootstrap install
  2. Add the dependencies of your program and things like X, Network Manager etc.
  3. Install your program
  4. Configure the system to automatically login root on boot
  5. Configure root to start X upon login (but only on virtual terminal 1)
  6. Create an .xinitrc to start your application upon X startup
Information on creating a bootable Debian live image can easily be found on the Internet. Unfortunately information on setting up the boot process is not as easy to find, and is instead scattered all over the place. A lot of documentation still refers to the sysvinit way of doing things, which won't work with systemd. Rather than write yet another blog post on the subject, I instead created a script that does all of this automatically. The code is available in this GitHub repo. It's roughly 250 lines of Python.
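
To give a taste of the systemd-era configuration, steps 4 to 6 in the list above essentially boil down to writing three small files into the target filesystem. Here is a rough Python sketch of just that part; the paths follow plain Debian conventions, /root/app.py is a made-up name for your application, and the real script in the repo differs in the details:

#!/usr/bin/env python3
# Sketch of steps 4-6: autologin root on tty1, start X on that login only,
# and launch the application from .xinitrc. Assumes it runs inside the
# target root filesystem; /root/app.py is a placeholder for your program.
import os

autologin = '''[Service]
ExecStart=
ExecStart=-/sbin/agetty --autologin root --noclear %I $TERM
'''

profile = '''if [ -z "$DISPLAY" ] && [ "$(tty)" = /dev/tty1 ]; then
    startx
fi
'''

xinitrc = 'exec /usr/bin/python3 /root/app.py\n'

def write(path, contents):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'w') as f:
        f.write(contents)

write('/etc/systemd/system/getty@tty1.service.d/override.conf', autologin)
write('/root/.profile', profile)
write('/root/.xinitrc', xinitrc)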

Using it is simple: insert a fresh USB stick in the machine and see what device name it is assigned to. Let's assume it is /dev/sdd. Then run the installer:

sudo ./createimage.py /dev/sdd

Once the process is complete, you can boot any computer with the USB stick and see this:


This may not look like much but the text in the top left corner is in fact a PyGTK program. The entire thing fits in a 226 MB squashfs image and takes only a few minutes to create from scratch. Expanding the program to have the functionality you want is then straightforward. The Debian base image takes care of all the difficult things like hardware autodetection, network configuration and so on.
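
The application itself does not have to be anything fancy to begin with. A minimal GTK program of roughly the following shape, using the GObject introspection bindings, is enough to get something on screen (the window title and label text are made up for this example):

#!/usr/bin/env python3
# Minimal GTK application of the kind the image boots into.
import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk

win = Gtk.Window(title='Firmware flasher')
win.add(Gtk.Label(label='Hello from the live image'))
win.connect('destroy', Gtk.main_quit)
win.show_all()
Gtk.main()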

Problems and points of improvement

The biggest problem is that when booted like this the mouse cursor is invisible. I don't know why. All I could find were other people asking about the same issue but no answers. If someone knows how to fix this, patches are welcome.

The setup causes the root user to autologin on all virtual terminals, not just #1.

If you need to run stuff like PulseAudio or any other thing that requires a full session, you'll probably need to install a full DE session and use its kiosk mode.

This setup runs as root. This may be good. It may be bad. It depends on your use case.

For more complex apps you'd probably want to create a DEB package and use it to install dependencies rather than hardcoding the list in the script as is done currently.

Saturday, December 23, 2017

"A simple makefile" is a unicorn

Whenever there is a discussion online about the tools to build software, there is always That One Person that shows up and claims that all build tools are useless bloated junk and that you should "just write a simple Makefile" because that is lean, efficient, portable and does everything anyone could ever want.

Like every sentence that has the word "just", this is at best horribly simplistic but mostly plain wrong. Let's dive into this in more detail. If you look up simple Makefiles on the Internet, you might find something like this page. It starts with a very simple (but useless) Makefile and eventually improves it to this:

IDIR =../include
CC=gcc
CFLAGS=-I$(IDIR)

ODIR=obj
LDIR =../lib

LIBS=-lm

_DEPS = hellomake.h
DEPS = $(patsubst %,$(IDIR)/%,$(_DEPS))

_OBJ = hellomake.o hellofunc.o 
OBJ = $(patsubst %,$(ODIR)/%,$(_OBJ))


$(ODIR)/%.o: %.c $(DEPS)
	$(CC) -c -o $@ $< $(CFLAGS)

hellomake: $(OBJ)
	gcc -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm -f $(ODIR)/*.o *~ core $(INCDIR)/*~

Calling this "simple" is a bit of a stretch. This snippet contains four different kinds of magic expansion variables, calls three external commands (two of which are gcc, just invoked in different ways) plus one of Make's internal functions (bonus question: is patsubst a GNU extension or is it available in BSD Make? what about NMake?), and requires an understanding of shell syntax. It is arguable whether this could be called "simple", especially for newcomers. But even so, it is completely broken and unreliable.

As an example, if you change any header files used by the sources, the system will not rebuild the targets. To fix these issues you need to write more Make. Maybe something like this example, described as A Super-Simple Makefile for Medium-Sized C/C++ Projects:

TARGET_EXEC ?= a.out

BUILD_DIR ?= ./build
SRC_DIRS ?= ./src

SRCS := $(shell find $(SRC_DIRS) -name *.cpp -or -name *.c -or -name *.s)
OBJS := $(SRCS:%=$(BUILD_DIR)/%.o)
DEPS := $(OBJS:.o=.d)

INC_DIRS := $(shell find $(SRC_DIRS) -type d)
INC_FLAGS := $(addprefix -I,$(INC_DIRS))

CPPFLAGS ?= $(INC_FLAGS) -MMD -MP

$(BUILD_DIR)/$(TARGET_EXEC): $(OBJS)
	$(CC) $(OBJS) -o $@ $(LDFLAGS)

# assembly
$(BUILD_DIR)/%.s.o: %.s
	$(MKDIR_P) $(dir $@)
	$(AS) $(ASFLAGS) -c $< -o $@

# c source
$(BUILD_DIR)/%.c.o: %.c
	$(MKDIR_P) $(dir $@)
	$(CC) $(CPPFLAGS) $(CFLAGS) -c $< -o $@

# c++ source
$(BUILD_DIR)/%.cpp.o: %.cpp
	$(MKDIR_P) $(dir $@)
	$(CXX) $(CPPFLAGS) $(CXXFLAGS) -c $< -o $@


.PHONY: clean

clean:
	$(RM) -r $(BUILD_DIR)

-include $(DEPS)

MKDIR_P ?= mkdir -p

It is unclear what the appropriate word to describe this thing would be, but "simple" would not be at the top of many people's lists.

Even this improved version is broken and unreliable. The biggest issue is that changing compiler flags does not cause a recompile; only changed timestamps do. This is a common source of silent build failures. It also does not provide any way to configure the build depending on the OS in use. Other missing pieces that should be considered entry-level features for build systems include:

  • No support for multiple build types (debug, optimized), changing build settings requires editing the Makefile
  • Output directory is hardcoded, you can't have many build directories with different setups
  • No install support
  • Does not work with Visual Studio
  • No unit testing support
  • No support for sanitizers apart from manually adding compiler arguments
  • No support for building shared libraries, apart from manually adding compiler arguments (remember to add -shared in your object file compile args ... or was it on link args ... or was it -fPIC)
  • No support for building static libraries at all
  • And so on and so on

As an example of a slightly more advanced feature, cross compilation is not supported at all.

These are all things you can add to this supposedly super simple Makefile, but the result will be a multi-hundred (thousand?) line monster of non-simplicityness.

Conclusions

Simple makefiles are a unicorn. A myth. They are figments of imagination that have not existed, do not exist and will never exist. Every single case of a supposedly simple Makefile has turned out to be a mule with a carrot glued to its forehead. The time has come to let this myth finally die.