Category Archives: Foomatic

Midway point reached, xmlParse is ready for production

Around the 20th of June I told Till I wanted to finish combo generation by the first of July and then take twenty days to integrate xmlParse (my Perl library) into DB.pm. I missed my first deadline and didn’t get to start integration until the ninth. Oddly enough, here it is the twenty-first and the last known bug in xmlParse has been squashed, so I ended up meeting my deadline anyway.

This is after testing the output of the foomatic 4 stable branch’s foomatic-compiledb script against that of the trunk’s xmlParse-based one. Over 6.6k flat-file PPDs came out exactly the same; the few that differ do so because of a change in how the maximum resolution of a pair is determined when a printer claims a lower resolution than the driver’s default.

This isn’t success for my GSoC project yet, but it is the biggest milestone. Over 7k lines of C (not even C++) have been replaced with 1k lines of Perl and 400 lines of data (xmlParse is enterprise-y in that the XMLs are mapped to the internal Perl data structure using data in the phonebook.pm lib).

Till has been a great mentor and really pushed me to get the job done right.

My first experience with profiling

During integration Till noticed that generating overviews was slow. This was the slowest operation and the only one where the new Perl library fares worse than the C, so I am not surprised that it came under inspection. Till tasked me with improving the performance. I was doubtful that improvements were possible. I was wrong. Once I profiled using NYTProf a clear bottleneck appeared: over half the run time was being spent cloning data structures. Many of these clones were unnecessary, and they were there because I had removed the conditional cloning (I thought it made the code cleaner).
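To make the conditional-cloning idea concrete, here is a minimal sketch. The `%cache` contents, the `get_entry` helper and its `$will_modify` flag are all hypothetical and are not taken from xmlParse; `dclone` comes from the core Storable module.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical cache of parsed entries; not xmlParse's real data.
my %cache = ( 'HP-DeskJet_500' => { make => 'HP', drivers => ['hpdj'] } );

# Return a cached entry. Deep-copy only when the caller says it will
# modify the result; read-only callers share the cached structure.
sub get_entry {
    my ($name, $will_modify) = @_;
    my $entry = $cache{$name} or return;
    return $will_modify ? dclone($entry) : $entry;
}

my $shared = get_entry('HP-DeskJet_500', 0);   # same reference as the cache
my $copy   = get_entry('HP-DeskJet_500', 1);   # independent deep copy
push @{ $copy->{drivers} }, 'deskjet';         # does not touch the cache
```

Devel::NYTProf is usually run as `perl -d:NYTProf yourscript.pl` followed by `nytprofhtml` to produce a report; the script name here is a placeholder.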

So that was my first experience with profiling. Altogether it went perfectly, even if only to uncover a bug of my own making.

Satisfaction

I’ve been using what I assume is a form of unit testing during the later stages of each implementation. The output of my Perl implementation is compared to the C’s; if a difference is found, processing stops and the two data structures are printed. Once I’ve brought the Perl in line with the C I test again, rinse and repeat. Each time, the time between failures increases.
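A rough sketch of that compare-and-stop loop, assuming both outputs are already loaded as Perl data structures. The `diff_structs` helper and the sample data are hypothetical; Data::Dumper is a core module, and `Sortkeys` keeps hash keys in a stable order so the two dumps can be compared as strings.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order so dumps are comparable
$Data::Dumper::Indent   = 1;

# Compare two structures by their sorted dumps. Returns undef when they
# match, otherwise a report containing both dumps.
sub diff_structs {
    my ($name, $from_c, $from_perl) = @_;
    my ($c_dump, $perl_dump) = (Dumper($from_c), Dumper($from_perl));
    return if $c_dump eq $perl_dump;
    return "Mismatch in $name\n--- C ---\n$c_dump--- Perl ---\n$perl_dump";
}

# Hypothetical sample data standing in for the two implementations' output.
my %c_out    = ( 'HP-DeskJet_500' => { maxres => 300 } );
my %perl_out = ( 'HP-DeskJet_500' => { maxres => 600 } );

for my $name (sort keys %c_out) {
    my $report = diff_structs($name, $c_out{$name}, $perl_out{$name});
    if (defined $report) {
        print $report;
        last;          # stop at the first difference
    }
}
```

Comparing dumps is strict: a number stored as 300 and a string stored as '300' dump differently, so this reports them as a mismatch.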

For the other data structures, processing everything never took more than a minute; with combos it takes up to seven minutes to process everything. As I work on bringing the Perl in line with the C I find myself sitting for increasing lengths of time. Thus satisfaction: the knowledge that the longer it takes, the better the Perl has gotten. =}

Processing Constraints, a solution

Yesterday I ran into a conundrum processing the constraints of options. I had a solution but I wasn’t happy with the cleanness of it. I slept on it and this morning I’ve got something I’m much happier with.

Instead of summarizing the constraints in a structure like this:

driver => option => printer => [sense, default]

I used a structure like:

printer => driver => option => [sense, default]

If a constraint does not specify a printer then it goes under the ‘*’ printer hash; likewise, a driver can also be ‘*’.
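As an illustration, a small sketch of folding one option’s constraints into that printer => driver => option layout. The `add_constraints` helper, the `default` key on a constraint, and the `opt/example` and `ev/example` identifiers are assumptions made for this example and are not taken from xmlParse.

```perl
use strict;
use warnings;

# Fold one option's constraint list into
# $index->{printer}{driver}{option} = [sense, default].
# A missing printer or driver is filed under '*'.
sub add_constraints {
    my ($index, $option, $constraints) = @_;
    for my $c (@$constraints) {
        my $printer = defined $c->{printer} ? $c->{printer} : '*';
        my $driver  = defined $c->{driver}  ? $c->{driver}  : '*';
        $index->{$printer}{$driver}{$option} = [ $c->{sense}, $c->{default} ];
    }
    return $index;
}

my %index;
add_constraints(\%index, 'opt/example', [
    { printer => 'printer/Lexmark-Z12', sense => 0 },
    { driver => 'lxm3200-tweeked', sense => 1, default => 'ev/example' },
]);
# $index{'printer/Lexmark-Z12'}{'*'}{'opt/example'}  is [0, undef]
# $index{'*'}{'lxm3200-tweeked'}{'opt/example'}      is [1, 'ev/example']
```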

This also allows me to simplify retrieving the constraints by merging it with the operation that determines which options to inspect for a particular driver-printer pair. This happens in the function getOptions($relations, $printerName, $driverName), which now only merges the lists of options from the three possibilities (driver, printer, and driver-printer), overwriting keys with those of the more specific possibilities.
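The merge might look roughly like the sketch below. Only the function name and its three parameters come from the post; the body, the sample data, and the choice to treat printer-only entries as more specific than driver-only ones are assumptions.

```perl
use strict;
use warnings;

# Sketch of the merge: later (more specific) levels overwrite earlier ones.
sub getOptions {
    my ($relations, $printerName, $driverName) = @_;
    my %options;
    for my $pair ( [ '*', $driverName ],            # driver only
                   [ $printerName, '*' ],           # printer only
                   [ $printerName, $driverName ] )  # driver-printer pair
    {
        my ($p, $d) = @$pair;
        # Check existence level by level to avoid autovivifying new keys.
        next unless exists $relations->{$p} && exists $relations->{$p}{$d};
        my $level = $relations->{$p}{$d};
        @options{ keys %$level } = values %$level;
    }
    return \%options;
}

my $relations = {
    '*'              => { 'hpdj' => { '7' => [1, 'ev/140'], '94' => [1, '60'] } },
    'HP-DeskJet_540' => { 'hpdj' => { '7' => [1, 'ev/137'] } },
};
my $opts = getOptions($relations, 'HP-DeskJet_540', 'hpdj');
# $opts->{'7'}  is [1, 'ev/137']  (pair overrides the driver-wide entry)
# $opts->{'94'} is [1, '60']      (inherited from the driver-wide entry)
```

Because each level is a plain hash, the override rule is just the order of the hash-slice assignments.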

Now I can move on to equivalency testing of parseCombo and the C implementation.

Processing Constraints, a conundrum

Edit: See “Processing Constraints, a solution” for the solution I ended up going with.

Constraints are an element in the option XMLs and determine which options match a driver/printer combo. They look like this:

'constraints' => [
	{
		'printer' => 'printer/Lexmark-Z12',
		'sense' => 0
	},
	{
		'printer' => 'printer/Lexmark-Z31',
		'sense' => 0
	},
	{
		'printer' => 'printer/Lexmark-Z31',
		'sense' => 1,
		'driver' => 'lxm3200-tweeked'
	}
],

While they look simple, I’ve run into a problem: how best to fit this into a data structure for easy lookup.

The printer and driver XMLs contain no references to options. Instead, an option contains these constraints, which specify the drivers and printers the option applies to. Unfortunately, generating a combo is driven by the driver and printer data. Because of this we have to compute an in-memory cache of these constraints in order to match a combo to its options (or reprocess the entire set of option XMLs for each combo (which would be absurd (rather like these nested parentheses))). The data structure of this cache is what has me troubled. I have a working structure, but I’m sure that if I were a better computer scientist I would have known something better than this:

'hpdj' => {
	'7' => {
		'HP-DeskJet_540' => [
			1,
			'ev/137'
		],
		'*' => [
			1,
			'ev/140'
		],
	},
	'94' => {
		'*' => [
			1,
			'60'
		]
	},
},
'deskjet' => {
	'6' => {
		'*' => [
			1,
			'ev/130'
		]
	},
	'GS-HalftoningAlgorithm-Gray' => {
		'*' => [
			1,
			'ev/GS-HalftoningAlgorithm-Gray-Standard'
		]
	}
},
'printers' => {
	'HP-DeskJet_500' => {
		'6' => {
			'*' => [
				1,
				'ev/130'
			]
		},
	},
},

So let it be said: even in a project as code-monkey as processing XML, computer science finds a way to sneak in.