
How to reduce a library size using etrace

Libraries may include hundreds and even thousands of object files that are not actually used by your application. In this article, I will discuss a method for reducing library size by getting rid of code that you do not need in your application.

This kind of task is common for embedded system developers, who need to make a library (especially when it is shared) small enough to fit on an embedded target platform, which usually has very limited memory.

To reduce our library, we will use the etrace utility and shell scripts. We assume that the library comes with a makefile (lib-makefile) and that our working environment is Linux/Unix.

About the etrace utility

The etrace utility relies on special features of gcc to trace function calls in an executable at run time. The utility comes in two parts, a C file and a Perl script. To generate a function call trace, the C file must be compiled and linked with the executable. The Perl script is then run on the generated executable.

Steps to trim unnecessary code

Create your executable and make sure it runs

First we need to make sure that our build environment works properly: build the library and the executable, then check that the executable runs as expected. This step also gives us a copy of the full library with all of its objects and functions included. After creating the full version of the library, keep a copy of it (we will refer to this copy as 'full-version.a').

Add gcc debugging options

For the etrace utility to work, we need to enable special features in gcc. This can be done by adding the following options to the gcc compile command (usually stored in the variable CC):

    -g -finstrument-functions -w

After downloading the etrace utility, unpack it and compile its C file (ptrace.c) to generate ptrace.o, then link your executable against the ptrace.o object.
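As a rough sketch of this step, the build could be wrapped in a small shell function like the one below. The function name build_traced, the intermediate file trace_runtime.o and all file names passed as arguments are placeholders of mine, not part of etrace. The sketch also assumes that the tracing C file is compiled without -finstrument-functions and that the library has already been rebuilt with the tracing options.

```shell
#!/bin/sh
# Sketch only: every file name passed to build_traced is a placeholder.
CC=${CC:-gcc}
TRACE_FLAGS="-g -finstrument-functions -w"

# build_traced RUNTIME_C APP_C LIBRARY OUTPUT
#   RUNTIME_C  the C file shipped with etrace
#   APP_C      the application source
#   LIBRARY    the library, already rebuilt with $TRACE_FLAGS
#   OUTPUT     name of the instrumented executable
build_traced() {
    # Assumption: the tracing runtime itself is compiled without instrumentation.
    $CC -g -c "$1" -o trace_runtime.o || return 1
    # The application is compiled with the tracing options and linked
    # against the runtime object and the library.
    $CC $TRACE_FLAGS "$2" trace_runtime.o "$3" -o "$4"
}
```

It could then be called as `build_traced <etrace C file> app.c full-version.a executable`, where app.c stands for your own application source.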

Trace function calls in your executable

Now that our executable and library are ready, we can use the etrace.pl script to trace the function calls. The instrumented executable records its calls in a file named TRACE, so we create that file, run the executable, and then run the script on it:

	touch TRACE
	./executable
	etrace.pl ./executable > log.txt

The file log.txt should now contain the function call tree of your program (executable). The following is a sample output of the trace:

./etrace.pl program
\-- main
|   \-- app_get_custom_resource_list
|   \-- BIO_new_fp
|   |   \-- BIO_s_file
|   |   \-- BIO_new
|   |   |   \-- R_malloc
|   |   |   \-- BIO_set
|   |   |   |   \-- R_malloc
|   |   |   |   \-- BIO_get_bio_meth
|   |   |   |   \-- EX_DATA_new_ex_data
|   |   |   |   \-- file_new
|   |   \-- BIO_ctrl
|   |   |   \-- file_ctrl
|   |   |   |   \-- file_free
|   \-- BIO_new_fp
|   |   \-- BIO_s_file

Find unused objects

Now that we have the function call tree, we can find out which object files are needed and which ones we can remove from the library. We use the trace in log.txt to find the object files that define the functions our program calls: for each function in log.txt, we look in the library for the object that contains its code. The following script generates the lists we are looking for:

	# Strip the tree decoration and keep each called function once
	sed 's/[[:space:]\\|-]//g' log.txt | sort -u > called.txt
	# List every function defined in the full library as "object function"
	if ! [ -f all.txt ]
	then
		nm -o full-version.a | grep -E ' [Tt] ' | sed 's/:/ /g' | cut -d" " -f2,5 > all.txt
	fi
	# Split the list into functions that were never called and functions that were
	grep -Fvf called.txt all.txt > toremove.txt
	grep -Ff called.txt all.txt > used.txt

See reusable sample code here findobjects.sh

The used functions and their associated objects will be in used.txt; the toremove.txt file will list the unused functions and the object files where they are defined. It is the object files listed in toremove.txt that we need to remove from the library. Keep in mind that an object file appearing in both files still defines at least one function that is called.
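The split can be tried out on a few lines of made-up data before running it on a real library. In the sketch below, all object and function names are illustrative. It also swaps grep for awk so that the comparison is made on the exact function name: with grep -F, BIO_new would also match BIO_new_fp.

```shell
#!/bin/sh
# Worked example on made-up data; object and function names are illustrative.
work=$(mktemp -d) && cd "$work" || exit 1

# A short trace in the tree format shown earlier.
cat > log.txt <<'EOF'
\-- main
|   \-- BIO_new_fp
|   |   \-- BIO_s_file
|   |   \-- BIO_new
|   \-- BIO_new_fp
EOF

# Stand-in for the "object function" pairs extracted from the nm output.
cat > all.txt <<'EOF'
bio_lib.o BIO_new
bio_lib.o BIO_free
bss_file.o BIO_new_fp
bss_file.o BIO_s_file
bss_mem.o BIO_s_mem
EOF

# Strip the tree decoration and keep each called function once.
sed 's/[[:space:]\\|-]//g' log.txt | sort -u > called.txt

# First file fills the lookup table, second file is filtered on its
# second field (the function name), compared as a whole.
awk 'NR == FNR { called[$1] = 1; next } ($2 in called)' called.txt all.txt > used.txt
awk 'NR == FNR { called[$1] = 1; next } !($2 in called)' called.txt all.txt > toremove.txt

echo "used:";     cat used.txt
echo "toremove:"; cat toremove.txt
```

Here bio_lib.o ends up in both lists, because it defines one function that is called and one that is not.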

The list of objects without function names can be generated by issuing the following command (the object name is the first field of each line):

	cut -d" " -f1 toremove.txt | sort -u > nonusedobjs.txt
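An object file can define both called and uncalled functions. Such an object is listed next to its uncalled functions, yet removing it from the library would break the link. A more conservative variant, given here as my own sketch rather than as part of the original script, keeps every object that defines at least one called function. The data is made up and the file names usedobjs.txt and allobjs.txt are placeholders.

```shell
#!/bin/sh
# Made-up example data in the "object function" layout of all.txt / used.txt.
work=$(mktemp -d) && cd "$work" || exit 1

cat > all.txt <<'EOF'
bio_lib.o BIO_new
bio_lib.o BIO_free
bss_file.o BIO_new_fp
bss_mem.o BIO_s_mem
EOF

cat > used.txt <<'EOF'
bio_lib.o BIO_new
bss_file.o BIO_new_fp
EOF

# Objects that define at least one called function must stay.
cut -d' ' -f1 used.txt | sort -u > usedobjs.txt
# Every object present in the library.
cut -d' ' -f1 all.txt | sort -u > allobjs.txt
# comm -23 keeps the lines found only in the first file:
# objects in which no function at all was called.
comm -23 allobjs.txt usedobjs.txt > nonusedobjs.txt

cat nonusedobjs.txt
```

With this data only bss_mem.o is reported, while bio_lib.o is kept because BIO_new is called even though BIO_free is not.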

Exclude unused objects from the library

Using the list of unused objects, we now need to tell the makefile not to include these objects in the library. We can do this by removing them from the list of objects that the makefile archives into the library. Let's assume that this list is stored in a variable called MYLIBOBJ. The first step is to modify the variable declaration so that it has one object file per line. For example, the following declaration:

MYLIBOBJ= $(OBJ_D)/objfile034.o $(OBJ_D)/objfile045.o $(OBJ_D)/objfile026.o \
	$(OBJ_D)/objfile052.o $(OBJ_D)/objfile053.o $(OBJ_D)/objfile054.o \
	$(OBJ_D)/objfile022.o $(OBJ_D)/objfile023.o $(OBJ_D)/objfile027.o

should be modified to look like this:

MYLIBOBJ= $(OBJ_D)/objfile034.o \
	$(OBJ_D)/objfile045.o \
	$(OBJ_D)/objfile026.o \
	$(OBJ_D)/objfile052.o \
	$(OBJ_D)/objfile053.o \
	$(OBJ_D)/objfile054.o \
	$(OBJ_D)/objfile022.o \
	$(OBJ_D)/objfile023.o \
	$(OBJ_D)/objfile027.o

This can be done in vim by selecting the initial declaration and issuing the following command on the visually selected area : “:'<,'>s/\.o \$/\.o \\\r$/g”
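If you prefer to stay in the shell, the same reformatting can be sketched with sed. The example below works on a generated sample file, and split-makefile is a placeholder name. It assumes that every object in the list is written as `$(...)/name.o`, as in the declaration above.

```shell
#!/bin/sh
# Sketch on a throwaway copy; lib-makefile here is generated sample data.
work=$(mktemp -d) && cd "$work" || exit 1

cat > lib-makefile <<'EOF'
MYLIBOBJ= $(OBJ_D)/objfile034.o $(OBJ_D)/objfile045.o $(OBJ_D)/objfile026.o \
    $(OBJ_D)/objfile052.o $(OBJ_D)/objfile053.o $(OBJ_D)/objfile054.o \
    $(OBJ_D)/objfile022.o $(OBJ_D)/objfile023.o $(OBJ_D)/objfile027.o
EOF

# Replace ".o $(" with ".o \", a line break, and the start of the next entry.
# A backslash followed by a real newline is the portable way to put a line
# break into the replacement text of a sed substitution.
sed 's/\.o  *\$(/.o \\\
    $(/g' lib-makefile > split-makefile

cat split-makefile
```

On a real makefile the substitution would touch every line that contains `.o $(`, so it is safer to restrict it to the lines of the variable declaration, for instance with a sed line range.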

Now we can remove the unneeded objects from the MYLIBOBJ variable by issuing the following command:

	grep -Fvf nonusedobjs.txt lib-makefile > new-makefile

Now we can compile the library with the new makefile and manually fix minor compilation issues if any.
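To see what the filter does to the object list, here is a small sketch on a made-up makefile fragment in which the analysis is assumed to have flagged the last object of the list.

```shell
#!/bin/sh
# Sample data only; the makefile fragment and object names are made up.
work=$(mktemp -d) && cd "$work" || exit 1

cat > lib-makefile <<'EOF'
MYLIBOBJ= $(OBJ_D)/objfile034.o \
    $(OBJ_D)/objfile045.o \
    $(OBJ_D)/objfile026.o

lib.a: $(MYLIBOBJ)
EOF

# Pretend the analysis flagged the last object of the list.
echo 'objfile026.o' > nonusedobjs.txt

# Drop every makefile line that mentions a flagged object.
grep -Fvf nonusedobjs.txt lib-makefile > new-makefile

cat new-makefile
```

After filtering, the list ends with a backslash followed by an empty line, which make accepts. Without that empty line, the next rule would be absorbed into the variable, so it is worth checking the end of the list in new-makefile before rebuilding.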

Optional : Locating source files used to build the trimmed down library

Sometimes you may need to find the source files that generated a set of objects. The following script uses the makefiles of your project to do just that.

#!/bin/sh
#This script takes as input a list of object file names and finds the source
#files that generated these objects according to the rules found in a given
#set of makefiles
MAKEFILES="$HOME/makefile1 $HOME/makefile2"
#The location of the source files of the project (placeholder, adjust it)
SEARCHPATH="$HOME/src"
#Scratch file that receives the expanded makefile rules (placeholder name)
RULES="/tmp/findsource-rules.txt"
echo "" > $RULES
#Expand some variables if needed
#(e.g. some file names may contain variables that need to be expanded)
cat $MAKEFILES | sed "s:\$(OBJ_PREFIX):obj_:g" > $RULES
lookupsource() {
	obj=$1
	#Find rules that contain reference to the object and extract the sources
	src=$(grep -F -- "$obj" "$RULES" | grep -e "-o" | grep -e "-c" | perl -n -e 'if ($_=~/(\w*\.c)/) {print "$1\n"}')
	#Print the sources
	if [ -n "$src" ]
	then
		echo "$src" | tr '[:space:]' '\n' | xargs -I% basename % | xargs -I% find $SEARCHPATH -name % -print
	else
		echo "Source not found for $obj"
	fi
}
while read line
do
	lookupsource "$line"
done

Get source file here findsource.sh

The script above can be run as follows:

cat used-objects.txt | findsource.sh
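
The steps above do not show how used-objects.txt is produced. One plausible way, assuming the "object function" layout of the used.txt file generated earlier, is to keep the first field of each line. The data below is made up.

```shell
#!/bin/sh
# Made-up sample of used.txt, one "object function" pair per line.
work=$(mktemp -d) && cd "$work" || exit 1

cat > used.txt <<'EOF'
bio_lib.o BIO_new
bss_file.o BIO_new_fp
bss_file.o BIO_s_file
EOF

# Keep the object name only, listing each object once.
cut -d' ' -f1 used.txt | sort -u > used-objects.txt

cat used-objects.txt
```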