Tales of an Indie Developer: programming

Tuesday, January 23, 2018

Your Documents Under the Magnifying Glass

A few years ago I moved my household administrivia to a paperless system. Instead of stacking file folders deep with bills and statements, everything would be scanned & shredded. This greatly helped with storage space - but in a couple of years I ended up with a network drive filled with over 3,000 PDFs, images and documents. Bear in mind the majority of these are scanned documents - so the contents are images instead of machine-readable text. Everything was dumped into a single directory and files were named based on the timestamp of when they were scanned, so organizing documents into folders and sub-folders took hours.

Instead of burning hours sorting documents I started burning hours building a simple set of applications that would read document metadata, attempt to convert the images to text, group documents by common letterhead and then provide a simple search interface over all of it. Since optical character recognition is hit-and-miss, any full-text search should permit approximate indexing and searching to allow for fuzzy matches.
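To make the fuzzy matching idea concrete, here is a minimal sketch using only Python's standard library. It is not how DocMag does it (DocMag leans on Elasticsearch for search); the function name, the threshold and the sample OCR text are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_contains(query, text, threshold=0.8):
    """Return True if any word in text is approximately equal to query.

    Hypothetical helper: the 0.8 threshold is an arbitrary assumption,
    not a value taken from DocMag or Elasticsearch.
    """
    query = query.lower()
    for word in text.lower().split():
        # ratio() is 1.0 for identical strings and drops toward 0.0
        # as the two strings share fewer characters in order.
        if SequenceMatcher(None, query, word).ratio() >= threshold:
            return True
    return False


# Made-up OCR output with typical misreads ("l" read as "1", a dropped "e").
ocr_text = "Your e1ectric statment is now availab1e"

print(fuzzy_contains("statement", ocr_text))
print(fuzzy_contains("invoice", ocr_text))
```

A search for "statement" still finds the misread "statment", while an unrelated word such as "invoice" does not match anything.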

In the end I created two apps: DocMag and DocIndex. DocMag serves as the search front-end and allows users to perform full-text searches on scanned documents, label them with tags and automagically group other documents with the same letterhead or logo. The interface is pretty spartan and uses Spring Boot to build a straightforward integration into Elasticsearch. DocIndex is the batch process that crawls a filesystem and parses the documents using OCR, generates thumbnails, tags similar documents using computer vision-based template matching, and stores document metadata within Elasticsearch.

DocMag was created in Groovy using Spring Boot (Spring Web, Spring Data, etc). I did this mainly to understand how Spring Boot's conventions translated over to the Groovy world... it had been quite a while since I had worked with Grails. It turns out that Groovy, Spring Boot and Thymeleaf complement each other quite well and make for fairly simple web development.

DocIndex was created with Spring Boot and Java 9 initially. I griped in an earlier post about my problems with Java 9's dependency management, so instead I fell back to the lambda expressions and work queue management within Java 8. This permits multithreaded parsing of discovered files, which then allows for vertically scaling document indexing by adding cores. Horizontal scaling should be possible by replacing the in-memory work queue with a proper shared message broker. There is a "reminder" issue I've already filed to migrate to a proper broker so this can be done sometime in the future.
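The work queue approach can be sketched roughly as follows. DocIndex itself is written in Java 8, so this Python version is only an analogue: `parse_document` and `index_directory` are hypothetical names, and the "parsing" here just reads file metadata instead of running OCR.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def parse_document(path):
    # Stand-in for the real work (OCR, thumbnails, template matching).
    # Here we only collect basic file metadata.
    return {"name": path.name, "size": path.stat().st_size}


def index_directory(root, workers=4):
    """Crawl root and parse every discovered file on a pool of workers."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    # The executor owns an in-memory work queue; each worker thread
    # pulls the next file as soon as it finishes the previous one.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_document, files))
```

Raising `workers` is the vertical scaling knob. In Python specifically, CPU-bound parsing would likely need `ProcessPoolExecutor` (or native libraries that release the interpreter lock) to actually use more cores; Java threads do not have that limitation.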

Both DocMag and DocIndex are published as container images on Docker Hub. This was especially necessary with DocIndex, as it relied heavily on native libraries for Tesseract OCR and OpenCV. OpenCV was the most contentious - each Linux distribution has a different version of OpenCV, and the version changes quite rapidly. Building containers for distribution allowed me to ensure users got the correct version of native libraries that worked well with their Java bindings.

Another nice feature of the containerized deployment model was composition - I was able to pair the correct revision of Elasticsearch, conditionally include Kibana, and provide a simple web application firewall by placing DocMag behind modsecurity and Apache. Network connections could be maintained between Elasticsearch, modsecurity, and DocMag without any of these interconnects leaking to the "outside" world, allowing me to do things such as only expose modsecurity to outside traffic and only permitting DocMag to receive requests through modsecurity. Elasticsearch could be hidden as well, only available on the internal network managed by Docker Compose.

Deployment can be relatively straightforward; since everything is deployed to Docker Hub as a container, one should just need to download the docker-compose.yml file and issue export DOCUMENT_HOST_DIR=/mnt/documents && docker-compose up -d. This should provision a single-node Elasticsearch instance, start DocMag behind modsecurity, and begin indexing with DocIndex.

If you are stuck digging through mountains of scanned documents, give DocMag a try. Ease of installation is one of its primary goals - so let me know if you find any issues getting it running!

Saturday, February 04, 2017

Alarm Clock Hacking by Blocks

A little over two years ago I built an alarm clock intended for hacking by kids, using a web-based Python IDE. When I tested the lessons, I found that kids didn't like messing with Python and only learned enough to get things barely working. Yet, when it came to Scratch Jr or the desktop version of Scratch, they would spend hours at a time. I needed to find a more approachable way to code.

Recently I discovered Blockly, a product from Google for Education. With that framework you can code by blocks and use its transcoder to output JavaScript, Python, Lua, Dart or (ugh) PHP. The transcoder runs entirely client-side, and the output is human-readable - well indented and even commented.

Writing custom blocks turned out to be an easy thing, so I created blocks to modify the LED display, send audio out to a speaker, or react to button presses. Now you can use blocks to program the clock, while retaining all the functionality present in the older Python interface.

If I were going to redo the Hack Clock, this time I wanted to have a presentable site with full hardware and software lessons, for both Python and Blockly. I revamped the Hack Clock website, completed the Python lessons that I left incomplete last time, wrote new Blockly lessons for the new IDE, and completely re-did the hardware how-tos. Lesson writing took up the lion's share of time, since they all needed new images and better testing.

Another bit o' feedback I had received was that installing the Hack Clock software was too much of a pain. I tried to make this a bit easier this time by offering releases within a Debian package, although you still needed to use apt to install dependencies. Still, this cuts down installation from over an hour to about ten minutes... and most of those ten minutes is spent twiddling your thumbs while you wait for packages to download and install.

The hardware needed tweaking as well. It turns out the Raspberry Pi headphone jack is just a PWM pin hack and it seemed that GStreamer sometimes just couldn't grok it. The headphone jack was never a complete solution either - it required a discrete amplifier to power speakers, and soldering wires onto a 1/8" jack is a GIGANTIC pain. To make the audio hardware easier to cope with, I moved away from the headphone jack to Adafruit's I2S decoder and amplifier. It provided better audio and cleaner installation without increasing my part count or price. It has proven to be easier for everyone so far.

The old Hack Clock had another embarrassing flaw: it could only handle one button input and couldn't manage output at all. That drove me nuts and was probably the second biggest thing I wanted to fix. With the latest release the Hack Clock can handle as many buttons as you have GPIO pins, and you can also drive output pins as "switches" in code. The code-by-blocks IDE could deal with buttons and switches as simple function blocks - which meant reacting to user input became much easier to code.

Once things were ready, I installed the Hack Clock software in a mission-critical environment: kids' rooms. So far things have gone well; audio has been more reliable than with the headphone jack, and they have been able to tweak the software more easily than with Python. One bit I noticed this round however: kids don't like looking down to read something, then looking back to code it. The next generation Hack Clock should have an interactive demo to guide them through the lessons so they never have to glance away from the IDE.

I'd love to hear what other people experience when they try to get the Hack Clock running as well. A hardware list is posted on Hackaday, and all the instructions are at http://hackclock.deckerego.net/. Let me know what you think!

Sunday, February 21, 2010

Camel Integration

Sweet monkey spit... did I just miss January entirely? Is it seriously February? Dang.

One thing I have been researching lately has been enterprise integration frameworks. I'm a big fan of thinking in patterns while doing design and development and that goes double for architecting systems as well. When I sit down to architect a system I do my best to first think in terms of Enterprise Integration Patterns, then in terms of the Gang of Four's Design Patterns, and then in terms of implementation.

For a current project I've completed the architecture and a good chunk of design, so now I'm looking at implementation. I need to tie together a lot of disparate views to a lot of disparate services, which originally sent me down the path of evaluating Enterprise Service Bus products. I liked the content based routing using Drools utilized by JBoss ESB and ServiceMix. The orchestration of Chainbuilder was also a nice move forward. All had proper integration with naming and directory services as well. Ultimately, however, I found that most of the functionality within an ESB solution simply wouldn't be leveraged; the authentication and authorization model would need to be refactored, I didn't really need transactional support and there was no business process management to speak of.

I really just needed component adapters, translators and content-based routing. Slimming things down to just an integration framework left two main choices: Apache Camel and Spring Integration. Both offered the decoupling I needed and mirrored the Enterprise Integration Patterns I had already used in my architecture layouts. Even though these two frameworks are remarkably different in their implementation they are difficult to contrast. The rumor is that this similar-but-different situation has a history behind it: the Camel team originally envisioned becoming a part of Spring. Of course Spring ultimately decided to roll their own to better fit their view and style (especially in a Spring 3.0 world).

Probably the best comparison is provided by actual, concrete examples of event notification on Hendy Irawan's blog: one written with Spring Integration, another with Apache Camel. The examples are written to leverage Spring Remoting, a remote method invocation mechanism not unlike Jini's remoting mechanism. While the examples do have to jump through the hoops of creating a proxy object, the meat of the comparison is in the application context of the two examples. Note that the Spring Integration configuration is a bit more readable and maps more distinctly to an integration pattern layout. The Apache Camel configuration just looks like a standard Spring bean context, however it uses URIs for connecting components rather than using dependency injection to connect them.

Ultimately I like the convention of having endpoints addressable by URIs rather than keeping them as actual bean references. For my tastes this makes things more agnostic (even from Java itself) and more coherent. Having URIs indicate addressable resources is a convention most developers are familiar with, if only from writing curl scripts.
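As a rough illustration of URI-addressed endpoints combined with content-based routing, here is a small sketch. It is not Camel's or Spring Integration's API: the `Registry` class, the `content_based_router` function and the `direct:` style URI strings are assumptions invented for this example, written in Python rather than Java to keep it short.

```python
class Registry:
    """Maps URI strings to handler callables (hypothetical, not Camel's API)."""

    def __init__(self):
        self._endpoints = {}

    def register(self, uri, handler):
        self._endpoints[uri] = handler

    def send(self, uri, message):
        # Components only know each other's URIs, never object references.
        return self._endpoints[uri](message)


def content_based_router(registry, rules, default_uri):
    """Build a handler that forwards each message to the first matching URI."""
    def route(message):
        for predicate, uri in rules:
            if predicate(message):
                return registry.send(uri, message)
        return registry.send(default_uri, message)
    return route


registry = Registry()
registry.register("direct:invoices", lambda m: "invoice handler got " + m["id"])
registry.register("direct:orders", lambda m: "order handler got " + m["id"])
registry.register("direct:unknown", lambda m: "unroutable message " + m["id"])

rules = [
    (lambda m: m["type"] == "invoice", "direct:invoices"),
    (lambda m: m["type"] == "order", "direct:orders"),
]
registry.register(
    "direct:inbound", content_based_router(registry, rules, "direct:unknown")
)

print(registry.send("direct:inbound", {"type": "invoice", "id": "42"}))
```

Because the router refers to its targets by string, swapping the invoice handler for a different implementation only means registering something else under the same URI.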

There are a lot of facets to view when attempting to evaluate integration frameworks and service buses. Gunnar Hillert's blog can give you a small peek into how wide of an ecosystem this really is - you can't just perform a straight-forward SWOT analysis any more. One must always architect first, design second, then look at what implementation can get you there with ease and speed.

Tuesday, August 04, 2009

Memory Mis-Management

After cracking open Qt Creator and picking up Qt 4.5 development quite nearly two years after putting it down I found my C/C++ to be really wanting. All the habits I had developed earlier had simply leaked out of my head. I hadn't thought in terms of delete/malloc/free/pointers/references/virtual functions in so long that those neurons had since been re-allocated to other important devices, such as figuring out how to get americanos out quickly without breaking the espresso maker.

My brain just doesn't shift from domain to domain like it used to. Recently I was working on reducing some sort of algebraic expression of matrix transformations or some crap when a visiting fellow asked about normalizing data in an RDBMS. My brain shifted without a clutch. I kinda sat there, utterly stupefied, while my noggin tried desperately to come to terms with a) what words actually meant in the English language and b) how to shove data into a database table.

My brain is currently doing that with C++ memory management, too. Valgrind has very politely brought to my attention that my app is leaking like a freaking waterfall and my pointer management is beyond stupid. I needed a boot to my brain to make it jump back to C++ object-land.

Evidently my brain is not the only one that Java has softened. Not too long ago the Amarok team noticed that an influx of Java programmers brought with it fairly poor memory allocation habits and posted "Tips on memory management with C++ and Qt" to the mailing list. Both the message itself and the following responses I found interesting... they gave a quick synopsis of things that Javabrains do incorrectly when having to think in Qt's C++ garden.

I started reading Appendix B of the First Edition of C++ GUI Programming with Qt 4 by Jasmin Blanchette and Mark Summerfield. The appendix, "Introduction to C++ for Java and C# Programmers," skips extraneous lessons concerning object oriented programming and directly addresses the C++ conventions that have since escaped my memory. The language in the book is direct and approachable; now that I'm into it, the practice of everything is starting to come back to me. Hopefully I won't make stupid inheritance mistakes with virtual functions.

The paradigm of passing by value vs. passing by reference takes breaking some tough habits, but Qt is helping me out. Valgrind telling me of abandoned and undeleted objects finally reminded me why every object in Qt needs a parent - the removal of the parent needs to signal the removal of all children. I also need to be more disciplined in the use of QPointer to pass around references. Just as Crystal Space's smart pointers saved me numerous times in the past I'm sure Qt's smart pointers will save me from myself as well.

Saturday, August 01, 2009

Create a Qt

I downloaded Qt Creator and have been playing with it a bit. Previously I was using KDevelop for Qt 4 development, which worked alright. It had fair integration for Valgrind and GDB, and the editor worked fairly well. It had a few hooks for qmake and handled the Qt project building process fairly well.

They don't have a native KDE 4 KDevelop just yet. Not a huge deal... I could easily install it & tweak it for my projects. Before I did, however, I thought I'd give Trolltech's Qt-centric IDE a spin.

Trolltech says Qt Creator's focus is "...not [to] solely focus on a big feature list, but also on small details which make your life easier." Such a goal describes the project fairly well; I was pretty impressed with how easy it was to carry my KDevelop project over. Qt relies on project files (instead of Makefiles or configure scripts) for determining build flags and resources. Those same project files were directly imported into Qt Creator, which set up the IDE to match. Library dependencies were set up right off the bat; no problems at all. Just a click and builds were running immediately.

Debugging is integrated quite nicely. Since the vast majority of my time is spent in either NetBeans or Eclipse working with Java EE 6 stuff I've grown accustomed to robust and very granular debugging that allows me to dig deep into variables of every scope. While GDB only lets me go so far, Qt Creator presents the info fantastically and allows me to drill into objects in a very familiar way.

I think Java and the vast software stack around it is still the best way to engineer enterprise or academic applications. From Lucene to Stanford's Log-linear Part-Of-Speech Tagger it seems that most services and library software engineers would rather work within a fast virtual machine and forgo worrying about memory allocation or debugging backtraces.

Still, one has to wonder about where the wind will blow Java now that Oracle has swallowed the Sun. Java on the desktop, despite attempts with Swing and JavaFX, just hasn't received the attention that it needs. It's to the point where I had to write native code to get the system properties I wanted. Java 6 update 10 was a huge step forward, but someone needs to carry the torch. I imagine that Oracle would shelve desktop Java just as it might for a myriad of other Sun technologies.

With Oracle taking Java in some unknown direction and the Java desktop still needing attention, a framework / build environment such as Qt 4 stands in the gap nicely. It's a huge compromise between the ease of engineering with Java and the native accessibility that comes with C/C++. I worry about memory management (somewhat) less when sticking with Qt conventions and Qt Creator / GDB gives me nice debugging that approaches that of a JVM. It makes me wonder if my long languishing desktop apps could stand a Qt 4 re-write.

Monday, April 27, 2009

What I've Seen with Your Eyes

Eskil posted the video for his GDC presentations and they're abso-freakin-lutely amazing.

The gameplay of Love was interesting, but the video displaying the tools Eskil created is completely mind blowing. It's the stuff that actually gives you hope for the world again. The GDC tool video shows off several tools Eskil has released: Loq Ariou which allows you to create assets & models with the same ease as a pencil & scratch paper, Co On which provides scene mapping that's startlingly similar to how you might visualize things in your own mind, and Verse, a data transfer & protocol standard that allows such data to be shared instantaneously between applications.

Obviously Eskil had to create an intelligent set of tools to properly build Love within a decade, but I had no idea he had constructed such a cadre of tools that could be re-used by other developers. Not only does he speed content generation up and provide better interfaces - he goes one step further by breaking down human factor boundaries that plague every other asset generation tool to date. Just watch the video - especially the portion demonstrating shaders in Co On - and you'll see why I'm going completely nuts over these releases.

Eskil is giving back a huge amount to the community at large with these tools, and is likely opening the doors for many, many others to creatively express themselves in ways that were once prohibitively difficult. Love isn't just creating a fanbase... it's creating a legacy.

Friday, December 08, 2006

Computer Science Isn't Science

...it's math.

The inside joke in universities is that if a subject has the word "science" concatenated onto it, it's not really a science. "Social Science" isn't a science. "Computer Science" isn't a science. Physics is. Chemistry is.

I think it's true. What's now "Computer Science" (or in other completely nonsense realms, "Informatics") is really mathematics. I think a lot of modern colleges and universities are getting it completely wrong... don't put CS in with engineering, vocation or *gasp* business. Put it where it belongs.

The knowledge and understanding of algorithms is what separates decent coders from great ones. Nowhere is this demonstrated better than on Beyond3D's Origin of Quake3's Fast InvSqrt(). The article tries to trace a very elegant, fast and extremely effective five lines of code back to its original author. The understanding and anthropology of this Newton-Raphson inverse square root codification acts as a veritable who's-who in 3D real-time rendering, from Carmack to Gary Tarolli.
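For the curious, the trick can be sketched like this. The original is a handful of lines of C; this is a Python transliteration for illustration only (it will not be fast in Python), and the function name and the use of `struct` to reinterpret bits are choices made for this sketch. The magic constant 0x5F3759DF and the single Newton-Raphson step are the well-known parts of the technique.

```python
import struct


def fast_inv_sqrt(x):
    """Approximate 1/sqrt(x) for positive x using the bit-level trick."""
    # Reinterpret the bits of a 32-bit float as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Shifting roughly halves the exponent; subtracting from the magic
    # constant negates it, yielding a first guess at x ** -0.5.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits back into a float.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration refines the guess.
    return y * (1.5 - 0.5 * x * y * y)


for value in (0.25, 1.0, 2.0, 100.0):
    print(value, fast_inv_sqrt(value), value ** -0.5)
```

Comparing the two printed columns shows the approximation landing within a fraction of a percent of the exact value, which was plenty for normalizing vectors in a renderer.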

Take a look at "Exceptional C++", reviewed by the good ol' Register. This is not just a trove of complex but simple C++ snippets - it can be a litmus test for those rare C++ hackers that can change the physical properties of the world with a mere wave of their hand.

For those of us who are still on the intermediate side of the scale, S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani have been releasing drafts of their textbook, "Algorithms," to the general laudations of the programming populace. It really is a fantastic resource, if for no other reason than to have such a nice reference on-hand on-line.
