How to extract titles from web pages in Java
Let’s say you have a set of URLs and you want the web page titles associated with them. Maybe you’ve data-mined a bunch of links from HTML pages, or acquired a flat file listing URLs. How would you go about getting the corresponding page titles, and associating them with the URLs using Java?
You could use an HTML parser such as Jsoup to request the HTML document associated with each URL and parse it into a DOM document. Once obtained, you could navigate the document and select the text from the title tag, like so:
    String titleText = document.select("title").first().text();
Elegant, but a lot of overhead for such a simple task. You’d be loading the whole page into memory and parsing it into a DOM structure just to extract the title. Alternatively, you could use the Apache HTTP Client library, which provides a robust API for requesting resources over HTTP, but that would be unnecessary in this case. Let’s keep it simple and rely only on the Java standard library.
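As a rough illustration of the standard-library route, here is a minimal sketch. It is not the code from the full post: the class name TitleExtractor, the 8192-character read limit, and the UTF-8 charset are all assumptions made for this example. It reads only the beginning of the response and pulls the title out with a regular expression.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class TitleExtractor {
    // Matches <title ...>text</title>, ignoring case and allowing line breaks.
    private static final Pattern TITLE = Pattern.compile(
            "<title\\b[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Assumption: the title appears within the first 8192 characters.
    private static final int MAX_CHARS = 8192;

    /** Returns the title text with whitespace collapsed, or null if none is found. */
    static String extractTitle(Reader reader) throws IOException {
        char[] buf = new char[MAX_CHARS];
        int total = 0;
        int n;
        while (total < buf.length
                && (n = reader.read(buf, total, buf.length - total)) != -1) {
            total += n;
        }
        Matcher m = TITLE.matcher(new String(buf, 0, total));
        return m.find() ? m.group(1).replaceAll("\\s+", " ").trim() : null;
    }

    /** Opens the URL and extracts the title. Assumes the page is UTF-8 encoded. */
    static String fetchTitle(String url) throws IOException {
        try (Reader r = new BufferedReader(new InputStreamReader(
                new URL(url).openStream(), StandardCharsets.UTF_8))) {
            return extractTitle(r);
        }
    }
}
```

Capping the read keeps memory use small, at the cost of missing titles that appear unusually late in the document.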
Create thumbnails and avatars in Java
Avatars (icon-sized images used to represent people online) are usually square. For example, Facebook avatars are 50 by 50 pixel images. Most source images are not square, however. Cameras typically take pictures with a 4:3 aspect ratio. So to create an avatar from an image, you must do two things: crop it to a square and resize it to your desired icon size.
Thumbnails present a similar challenge. When displaying thumbnails in an image gallery, you typically want at least one dimension (width or height) to be the same for all images to avoid a very sloppy look. The cleanest look is achieved with square thumbnails of equal size, though the necessary cropping may not be desirable in every case.
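The two steps (crop to a square, then resize) can be sketched with the standard library alone. This is an illustrative sketch, not the utility from the full post: the class name AvatarSketch, the centered crop, and the single-pass bilinear scaling are assumptions.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only.
class AvatarSketch {
    /** Crops the largest centered square out of the source image. */
    static BufferedImage cropToSquare(BufferedImage src) {
        int side = Math.min(src.getWidth(), src.getHeight());
        int x = (src.getWidth() - side) / 2;
        int y = (src.getHeight() - side) / 2;
        // Note: getSubimage shares pixel data with the source image.
        return src.getSubimage(x, y, side, side);
    }

    /** Scales the image to the given size in a single pass. */
    static BufferedImage resize(BufferedImage src, int width, int height) {
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return out;
    }

    /** Crop to a square, then resize to size x size pixels. */
    static BufferedImage makeAvatar(BufferedImage src, int size) {
        return resize(cropToSquare(src), size, size);
    }
}
```

For a 50 by 50 avatar you would call AvatarSketch.makeAvatar(image, 50).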
Handy RowMapper base class for Spring JDBC
RowMappers are needed all over the place in your Spring JDBC DAO classes. One challenge that I kept running into was that when I wanted to reuse a particular RowMapper class for numerous queries, there was an ever-present threat of an underlying SQLException if certain columns were not present in the ResultSet. Obviously, a reusable RowMapper will set every field on the object it maps, but not every ResultSet will include every column. Calling rs.getString("column_name"); will result in an exception being thrown if column_name is not present in the particular ResultSet.
So to solve this problem, I wrote a base RowMapper class.
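The class itself is not reproduced in this excerpt. As a minimal sketch of the general idea, assuming a column lookup through ResultSetMetaData, a base class could check whether a column exists before reading it. The name ColumnAwareMapper and its helper methods are hypothetical. To keep the example free of dependencies it does not reference Spring, but its mapRow method has the same signature as Spring's RowMapper, so in a real project it could declare implements RowMapper<T>.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

// Hypothetical base class; in a Spring project it could implement RowMapper<T>.
abstract class ColumnAwareMapper<T> {

    public abstract T mapRow(ResultSet rs, int rowNum) throws SQLException;

    /** Returns true if the result set has a column with this label (case-insensitive). */
    protected boolean hasColumn(ResultSet rs, String columnName) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            if (columnName.equalsIgnoreCase(meta.getColumnLabel(i))) {
                return true;
            }
        }
        return false;
    }

    /** Reads a String column, or returns null when the column is absent. */
    protected String getString(ResultSet rs, String columnName) throws SQLException {
        return hasColumn(rs, columnName) ? rs.getString(columnName) : null;
    }
}
```

The metadata is scanned on every call, which is simple but not free. The result is deliberately not cached on the mapper, because the same mapper instance may be reused with result sets that have different columns.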
Define global lists, sets and maps in Spring configuration
Sometimes you may need to define stand-alone collections (lists, maps and sets) in your Spring XML configuration files, so that your beans can reference them as properties. This is easy to do using the util schema. For example, let’s say you want to define a set of credit scores in your applicationContext.xml file:
    <util:set id="creditScores">
        <value>600</value>
        <value>710</value>
        <value>760</value>
    </util:set>
Now you can reference the set in your beans, like so:
    <bean id="creditBean" class="com.this.is.my.CreditBean">
        <property name="myScores">
            <ref local="creditScores"/>
        </property>
    </bean>
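On the Java side, the bean only needs a setter that matches the property name. The class below is a hypothetical sketch of what CreditBean might look like, not code from the original post. The Set<Integer> element type is an assumption: Spring's type conversion should turn the <value> strings into Integers based on the setter's generic type.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical bean matching the "creditBean" definition above.
class CreditBean {
    private Set<Integer> myScores = Collections.emptySet();

    // Called by Spring for <property name="myScores">.
    public void setMyScores(Set<Integer> myScores) {
        this.myScores = new LinkedHashSet<>(myScores); // defensive copy
    }

    // Read-only view so callers cannot modify the injected set.
    public Set<Integer> getMyScores() {
        return Collections.unmodifiableSet(myScores);
    }
}
```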
Annoying: Netbeans code completion in comments
I don’t understand why Netbeans performs code completion when you are typing a comment. If I am working in a Java file, typing a comment, and come to a word like “if” or “for”, Netbeans automatically inserts the code completion templates for those keywords. Shouldn’t it recognize that I am typing within a comment? Is there a fix for this issue?
EDIT: Another thing that drives me mad: I use Alt+F S to save. Constantly. I’m so trained in doing this that I usually don’t even think about it. In Netbeans, however, this often results in the letter “s” being typed, either because the File menu didn’t drop down fast enough or the S key registered twice. I’ve never had this problem in any other program.
How to load an image from a URL in Java
Assuming you want to load the image from the URL into memory in order to display or manipulate it:
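    import java.awt.image.BufferedImage;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import javax.imageio.ImageIO;

    try {
        URL imageUrl = new URL("http://example.com/image.jpg");
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = imageUrl.openStream()) {
            // ImageIO.read returns null if no registered reader can decode the data
            BufferedImage image = ImageIO.read(in);
        }
    } catch (IOException ioe) {
        // log the error
    }
Loading an image from a URL is as simple as that. Now you’ve got a BufferedImage.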
It can be even simpler, and if all you are doing is cropping or resizing the image, you may be interested in a little Java image utility I wrote. Using my ImageLoader and Image classes, the code would look like:
    try {
        Image image = ImageLoader.fromUrl("http://example.com/image.jpg");
    } catch (IOException ioe) {
        // log the error
    }
The Image class is a wrapper around a BufferedImage and provides some useful methods for cropping, resizing, and writing (saving) the image. Let me know in the comments below if you’ve found this helpful.
Java member access levels – public, protected, private
Java access level cheat sheet:
*No modifier is usually referred to as package-private.
- Classes can be declared:
  - public
  - package-private
- Methods and member variables can be declared:
  - public
  - protected
  - package-private
  - private
It’s easy to remember the effect of public and private. One grants total access, while the other restricts access to the class itself. Both protected and package-private grant access to the package, but only protected grants access to subclasses.
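To make the four levels concrete, here is a small sketch. The Account and SavingsAccount classes and their fields are made up for illustration. Both classes sit in the same package here, so the subclass can reach everything except the private field.

```java
// Package-private class: no modifier, visible only inside its own package.
class Account {
    public String owner = "ada";    // accessible from any class
    protected long balance = 100L;  // same package, plus subclasses in other packages
    int branchCode = 42;            // package-private: same package only
    private String pin = "1234";    // only inside Account itself

    public boolean checkPin(String candidate) {
        return pin.equals(candidate); // private access is fine within the class
    }
}

class SavingsAccount extends Account {
    String summary() {
        // owner, balance and branchCode are reachable from this same-package subclass.
        // Referring to pin here would not compile.
        return owner + " has " + balance + " at branch " + branchCode;
    }
}
```

If SavingsAccount lived in a different package, it could still read balance (protected) but no longer branchCode (package-private).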
Good practice dictates that you keep access as restricted as possible. This enforces encapsulation and helps forestall bugs in the code. Keep in mind, both public and protected members become part of your API forever (unless you want to break clients, which is not nice). Preserving encapsulation ensures you are not tied to a particular implementation and keeps your API clean.
If you repost the chart above, please cite this blog. (icons from famfamfam)
Resize images in Java, preserving image quality
It shouldn’t be so difficult to do simple image manipulation in Java. Resizing images is a frequently encountered need, often to create thumbnails or to shrink pictures taken from digital cameras to a reasonable display size. But how do you create thumbnails in Java without sacrificing image quality? Standard library image manipulation is severely lacking in this area.
Luckily, talented Java programmers have worked to create better solutions. I’ve thrown together an image utility, building off of the work of others, to expose a few basic image manipulation functions, namely: open (from a file, URL, InputStream or byte array), save to file, soften, resize, and resize to square. This may be useful to your project. Just read the important caveat toward the bottom of this post.
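One commonly used way to keep quality when shrinking an image is to scale down in several steps, halving the size each time, rather than in one jump. The sketch below shows that technique with the standard library. It is not the ImageUtil code, and the class name ProgressiveResizer is hypothetical. It assumes the target size is no larger than the source in either dimension.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper illustrating multi-step downscaling.
class ProgressiveResizer {
    /** Downscales by repeated halving until the target size is reached. */
    static BufferedImage downscale(BufferedImage src, int targetW, int targetH) {
        if (targetW < 1 || targetH < 1
                || targetW > src.getWidth() || targetH > src.getHeight()) {
            throw new IllegalArgumentException("target must be between 1 and the source size");
        }
        BufferedImage current = src;
        int w = src.getWidth();
        int h = src.getHeight();
        while (w > targetW || h > targetH) {
            // Halve each dimension, but never go below the target.
            w = Math.max(targetW, w / 2);
            h = Math.max(targetH, h / 2);
            BufferedImage step = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = step.createGraphics();
            try {
                g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                        RenderingHints.VALUE_INTERPOLATION_BILINEAR);
                g.drawImage(current, 0, 0, w, h, null);
            } finally {
                g.dispose();
            }
            current = step;
        }
        return current;
    }
}
```

Each intermediate step costs extra memory and time, so this trades speed for smoother results on large reductions.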
- Download: ImageUtil-1.11.zip
I make no warranties about this utility. If you like it, a link back to this blog would be more than welcome.
Export Netbeans templates, font colors, and other settings
Once you’ve got your IDE set up just to your liking, it would be painful to start over with default settings. So what if you switch to a new computer? You’d want to take all your settings and preferences with you, not attempt to recreate them from scratch.
As far as I know, Netbeans does not have the ability to export / import your custom settings. That’d be a real handy feature. But for now we have to get by without it.
I’ve found two posts on the topic:
http://searjeant.blogspot.com/2007/10/migrating-netbeans-settings.html
http://wiki.netbeans.org/TaTCodeTemplateBackup
I will try these when I get my new laptop set up with Netbeans, and will update this post with a story of success or failure.
UPDATE: I did try this, and it did work.
Here is what worked:
To transfer the settings, I copied the Editors, Preferences, and org-netbeans-api-project-libraries folders located in the .netbeans\6.9\config folder under my Windows user account to the second Netbeans config folder. (Note: I went from Netbeans 6.9 on one computer to Netbeans 7.0 on another.)
I was concerned that because the Netbeans versions were not the same, something would go awry. But rest assured, nothing blew up. I imagine certain version combinations may not play nicely together, though. YMMV.
To sanitize user content, use an HTML parser
It is especially important, if you allow any HTML at all in user-submitted content, to sanitize that content by actually parsing the HTML and filtering it for any tags or attributes you wish to exclude. If you fail to do so, your site may be vulnerable to XSS (cross-site scripting) attacks.
Q: “But isn’t it overkill to parse the HTML? Can’t I use other techniques, such as regular expressions or simple string replacement, to filter out dangerous tags and attributes?”
A: No, and I’ll explain why.
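One well-known weakness of simple string replacement can be shown in a few lines. This is an illustration added here, not necessarily the argument made in the full post, and the class NaiveFilterDemo is hypothetical. The filter strips script tags in a single pass, and a nested payload reassembles itself once the inner tags are removed.

```java
// Hypothetical demo of why single-pass string replacement is not a sanitizer.
class NaiveFilterDemo {
    /** Removes literal script tags by simple string replacement. */
    static String stripScriptTags(String html) {
        return html.replace("<script>", "").replace("</script>", "");
    }

    public static void main(String[] args) {
        String attack = "<scr<script>ipt>alert(1)</scr</script>ipt>";
        // Removing the inner tags joins the outer fragments back together.
        System.out.println(stripScriptTags(attack)); // prints <script>alert(1)</script>
    }
}
```

Variants such as upper-case tags or tags with attributes would pass through this filter untouched as well. An HTML parser sidesteps these problems because it works on the parsed structure rather than on raw text.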
Validate that URLs exist using jQuery / PHP
It would be nice to have a pure JavaScript method of validating that URLs exist. One imagines you could use an AJAX call and verify the HTTP status code (200, 404, etc.) returned. However, the browser’s same-origin policy blocks cross-domain AJAX calls unless the target server explicitly allows them (for example, via CORS headers). So, in general, this method would only work if you are validating that URLs exist on the same domain.
Perhaps there is a way to use a hidden iframe to test the existence of a URL. I am not aware of a way to get the HTTP status code of a page that loads inside of an iframe, though. I’m not sure it is possible. So you must rely on JavaScript plus a server-side programming language to perform this validation. I chose jQuery and PHP.
IP Addresses and Geolocation – Using GeoIP
Did you ever wonder how some sites are able to tailor a message specifically to your city? You might have seen these in ads: “Albuquerque mom loses 50 lbs!” Hmm. How did the site know you live in Albuquerque?
The answer is geolocation based on the user’s IP address. And of course, geolocation can be useful beyond tailoring custom ad messages. The good news? Free geolocation functionality exists!