Category Archives: Foomatic

Pulling options from the sql db is done.

Thought I’d give an update on my Christmas plans. The work ended up taking longer than just the Christmas break. I used the extra time to do some refactoring, which cut the size of DB.pm by a thousand lines or so.

Filters will slowly be pulled from DB.pm and moved into a logical hierarchy within the new ‘filter’ directory. As it is now, DB.pm is simply too large to wrap your head around. When I started this summer DB.pm was almost 10k lines. Since then it has slowly shrunk to 8k. Because of its size, some dead code has built up.

An excellent example of this is get_javascript2(), which the old openprinting website used. The funny part is that there is no get_javascript1() or get_javascript().

Oh, and then there is comment_filter(), which filters user input on a web page with regex. It actually works! Which is quite a feat considering the many gotchas with user input.

My strategy right now is to move the functionality that is actually in use out into modules. My hope is that the dead code will then become more obvious.

Christmas plans, get_option()

The openprinting bzr repositories are back up. The commit process has gotten a lot nicer. Previously I had to create a bzr bundle (a large diff) and upload that through Loggerhead, which would then commit the changes. Between our changelog, my local bzr, and the remote bzr, three commit messages were required per commit. With the new commit process I can commit from the command line (as it should be) and the commit messages are pushed automagically.

Fun fact: the changelog is the second largest file in the foomatic-db-engine repository. 5.5k lines and 10 years of commit documentation. I also made a bit of a mess of it before I noticed the standardized commenting format. I blame gedit for not providing highlighting.

Over the Christmas break I intend to implement support in get_option() for sql backends. This is one of the two bugs I assigned to myself after the summer. Till would probably prefer I implement printer groups, but sql support is a leftover from the summer and it wouldn’t feel right to implement new features before I have finished polishing my first feature.

Consolidating the foomatic-db option’s human readable namespace, for multi-lingual profit

Perhaps a bit verbose, but the foomatic-db option’s human readable namespace refers to all the elements in our options that are meant for the end user. The foomatic-db data sets allow for multiple languages to be present in human readable elements. The issue is that only one xml in our repository uses this feature: the main proprietary Epson driver xml, with its English and Japanese comments.

Thus for all practical purposes foomatic-db is a monolingual data set. This leaves downstream responsible for any translation.

Downstream may have an army of translators, but they are going to have a hard time if the data set is inconsistent, ambiguous, and verbose. Cleaning that up is what I spent my remaining week of GSoC working on.

To do this I used Google Refine and a set of throwaway scripts which created a csv with the xml filepath and the human readable string.

With a global view of the namespace it became clear that it wasn’t as bad as I thought it would be. All Resolution options were consistently named, as were Page Size options, and others. Some things were simply misspelled or similarly spelled, though in many cases I needed to do some digging to group things. In total I brought the number of unique option choices from ~830 to ~740, with other uncounted improvements to readability. A brief and incomplete overview of the consolidation:

  • Consolidated on the ‘Color’ spelling (rather than ‘Colour’)
  • Acronyms were expanded in some cases
  • Standardized on ‘Economy Mode’
  • Standardized on ‘Print Quality’
  • Expanded ‘600 DPI’ and the like to ‘600×600 DPI’
  • Standardized on ‘Color Mode’
  • If I noticed them I would remove redundant terms like ‘setting’*
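A few of the rules above can be sketched as a small Perl script. This is only an illustration, not one of the throwaway scripts I actually used: the file paths, the sample strings, and the normalise() and count_unique() helpers are all made up for the example, and the real work was mostly done by hand in Google Refine.

```perl
use strict;
use warnings;

# Hypothetical rows of (xml filepath, human readable string), standing in
# for the csv described above. The paths and strings are made up.
my @rows = (
    [ 'opt/a.xml', 'Colour Mode' ],
    [ 'opt/b.xml', 'Color Mode' ],
    [ 'opt/c.xml', '600 DPI' ],
    [ 'opt/d.xml', '600x600 DPI' ],
    [ 'opt/e.xml', 'Economy Mode Setting' ],
    [ 'opt/f.xml', 'Economy Mode' ],
);

# Apply a few of the consolidation rules from the list to one string.
# An ASCII 'x' is used in place of the multiplication sign to keep the
# example free of encoding concerns.
sub normalise {
    my ($text) = @_;
    $text =~ s/\bColour\b/Color/g;            # one spelling of Color
    $text =~ s/^(\d+) DPI$/${1}x${1} DPI/;    # '600 DPI' -> '600x600 DPI'
    $text =~ s/\s+setting$//i;                # drop the redundant 'setting'
    return $text;
}

# Count the distinct strings in a list.
sub count_unique {
    my %seen = map { $_ => 1 } @_;
    return scalar keys %seen;
}

my @before = map { $_->[1] } @rows;
my @after  = map { normalise($_) } @before;

printf "unique strings: %d before, %d after\n",
    count_unique(@before), count_unique(@after);
```

Counting the unique strings before and after is the same kind of measurement as the ~830 to ~740 figure, just on a toy data set.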

It wasn’t a major overhaul but hopefully this will result in more complete and helpful translations for end users.

*The user is already being shown a setting dialog. Visually the presence of a toggle or checkbox communicates that something is a setting. Appending ‘setting’ to a setting is thus redundant.

Final thoughts on GSoC 2011

Overall I am very happy with how my project progressed. The addition of xmlParse has reduced foomatic-db-engine by over 5500 lines of code, an 85% reduction. SqlLayer allows for near pyppd levels of performance for non-CUPS users. Perhaps most importantly, I’ve grown as a programmer and am more familiar with our Linux printing architecture.

A design decision I made early that I am especially proud of is phonebook.pm. With it I was able to write xmlParse and sqlLayer much more abstractly than their C and php counterparts, which meant a substantially reduced codebase. My only regret is not making phonebook even more general. It would have been a design challenge, but I think it is possible. This might just be the second-system effect talking.

While GSoC 2011 may be over I do intend to participate throughout the school year. I have already assigned myself two feature requests[1][2] and I still have option name consistency to work on prior to the new semester. This summer’s work will ship in Foomatic 5.

sqlLayer, pushing foomatic data into the database

The second portion of my project is to write a Perl library to push foomatic’s data into a relational database. This would allow the use of SQLite instead of the xml database for foomatic-db-engine. This isn’t going to affect CUPS users (the vast majority of people), since last year’s project (Vitor’s) created pyppd, which can side-step foomatic-db-engine entirely for end users. What it does do, though, is provide considerable performance increases for users of legacy spoolers.

As with xmlParse, I am not treading new ground: openprinting already has a script to import the data set. This script was written two years ago during another GSoC project, as part of the openprinting website re-design. A few months before this year’s GSoC, Till gave me a copy of the script along with a database dump. I was able to convert this database dump into an SQLite database. With those I’ve been able to make considerable progress. Currently I’m adding support for about one table a day.

Thinking about the project as a whole, I am rather proud of the phonebooks. By extending them to document the database schema I’ve been able to operate at a fairly high abstraction level. Whereas the C programs and the PHP import script had 100s of lines of simple ‘if def assign’, the phonebooks let a single* loop handle all the simple renaming and processing for xmlParse. For complex types the raw data is handed to special case code to process.
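To make the idea concrete, here is a minimal sketch of a phonebook-driven loop. It is not the real phonebook.pm: the entry layout, the field names, the translate() function, and the rule that groups of 10 or below are plain copies are all assumptions made for the example, and a plain hash stands in for the parsed xml.

```perl
use strict;
use warnings;

# Hypothetical phonebook: each entry maps a source field to a destination
# key and a group. Every name here is illustrative only.
my @phonebook = (
    { source => 'shortname', dest => 'name',    group => 1  },
    { source => 'longname',  dest => 'comment', group => 1  },
    { source => 'enum_vals', dest => 'choices', group => 12 },
);

# Special case handlers for complex types, keyed by group number.
my %special = (
    12 => sub {
        my ($raw) = @_;
        return [ map { +{ idx => $_ } } @$raw ];
    },
);

# One loop does all the simple renaming; complex types get their raw
# data handed to the matching special case handler.
sub translate {
    my ($record) = @_;
    my %out;
    foreach my $entry (@phonebook) {
        my $value = $record->{ $entry->{source} };
        next unless defined $value;
        if ($entry->{group} <= 10) {
            $out{ $entry->{dest} } = $value;
        } elsif (my $handler = $special{ $entry->{group} }) {
            $out{ $entry->{dest} } = $handler->($value);
        }
    }
    return \%out;
}

my $option = translate({
    shortname => 'Quality',
    longname  => 'Print Quality',
    enum_vals => [ 'draft', 'normal' ],
});

foreach my $key (sort keys %$option) {
    my $value = $option->{$key};
    printf "%s: %s\n", $key,
        ref($value) ? scalar(@$value) . ' choices' : $value;
}
```

Adding a new simple field then means adding one phonebook entry rather than another ‘if def assign’ block.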

Sample special case code for option’s complex data:
# The specific groups
} elsif ($group == 11) { # constraints
	setConstraint($node, \$perlData{$destinationKey});

} elsif ($group == 12) { # enum_values
	# Each enum_val child is collected into its own hash
	foreach my $subnode ($node->findnodes("./enum_val")) {
		my %enumValue;

		# The id attribute of the enum_val is stored under 'idx'
		foreach my $subsubnode ($subnode->findnodes('./@id[1]')) {
			$enumValue{"idx"} = $subsubnode->to_literal;
		}

		# The ev_longname text is stored under 'comment'
		foreach my $longnames ($subnode->findnodes('./ev_longname')) {
			$this->setHumanReadableText(\%enumValue,\"comment", $longnames);
		}

If I were to redo my work I would make the phonebooks document the structures of the complex data. This would allow an even further generalisation and do away with the special case code for the complex types.

That isn’t going to happen though. The current code has been tested and is working. And while the special cases could have been done better, I do realise that a more general approach would have had a much harder time conforming to the behaviour of the C programs.

 

*Not necessarily a single instance of the loop. I’m a bit ashamed to admit it, but there are actually three copies of the same loop, one for each xml type. It is this way because when I created the phonebooks I made groups above 10 namespace specific. Thus group 11 for an option xml is different from a printer’s group 11. In xmlParse this is implemented by keeping the option loop separate from the printer’s loop. The groups that all loops share are in a separate function, so really only the loop structure is copy-pasted. In sqlLayer I’ve kept the loop singular and simply added support for namespaces, support which will be made cleaner if I can think of a way.
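One way a single loop could cope with namespace specific groups is to key the special case handlers by namespace first and group second. The sketch below only illustrates that idea and is not how sqlLayer is written: the %handlers table, the handler_for() helper, and the handler bodies are hypothetical, and the meaning given to the printer’s group 11 is invented.

```perl
use strict;
use warnings;

# Hypothetical handlers keyed first by namespace, then by group number, so
# that group 11 can mean different things for options and printers.
my %handlers = (
    option  => { 11 => sub { "option constraint: $_[0]" } },
    printer => { 11 => sub { "printer specific field: $_[0]" } },
);

# Look up the handler for a namespace specific group. Returns undef when
# there is none, so the caller can fall back to the shared groups.
sub handler_for {
    my ($namespace, $group) = @_;
    return unless exists $handlers{$namespace};
    return $handlers{$namespace}{$group};
}

foreach my $namespace (qw(option printer)) {
    my $handler = handler_for($namespace, 11);
    print $handler->('raw data'), "\n";
}
```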