Java First Contact
Background
Remember that I am sort of new to Java - I've read the first couple of hundred pages of Java in a Nutshell ([1]) - and have only written hello world before.
There is an excellent IDE for programming in Java: Eclipse (read more on [2] or [3]). I used Eclipse and my close friend GNU Emacs.
Understanding Java is a little annoying since there are many things called Java (just as .NET shares its name with a popular top-level domain):
- There is the Java Programming Language (see [4]). This is what you use to write your programs.
- There are many Java compilers (see [5]), including one from Sun, one from GNU and one from Eclipse. The GNU one, GCJ (see [6]), can also compile into a regular executable.
- The compilers compile the Java source code into Java bytecode (see [7]). The bytecode is what corresponds to the Microsoft Intermediate Language (or Common Intermediate Language) in .NET, see [8].
- The Java Virtual Machine can then execute the Java bytecode and we get what we wanted. The beauty of this is that you can write and compile your program once and then execute it on any platform that has a Java Virtual Machine!
- Last but not least there are a number of Java Platforms - or Java Core APIs. This is the impressive collection of classes that you use to write your high level programs.
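The chain in the list above (source code, compiler, bytecode, virtual machine) can be tried out with the smallest possible program. This is only a minimal sketch: the class name Hello and the method greeting are made up for illustration, and the commands in the comments assume that a JDK with javac and java is on the path.

```java
// Hello.java: source code written in the Java programming language.
// Compile it to bytecode with:  javac Hello.java   (this produces Hello.class)
// Run the bytecode on a JVM:    java Hello
class Hello {
    // Hypothetical helper, kept separate so the message is easy to inspect.
    static String greeting() {
        return "hello world";
    }

    public static void main(String[] args) {
        // System.out and String come from the Java core API (the "platform").
        System.out.println(greeting());
    }
}
```

The same Hello.class file should run unchanged on any platform that has a Java Virtual Machine.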
When Microsoft created a number of extra nice and fancy libraries that they wanted to insert into the Sun Java Platform, something interesting happened. These libraries were only to target Windows systems, but since Java is intended to target all systems Sun did not allow the libraries in there. Microsoft got pissed off and created their own Java: .NET. It's not Java, but it has everything (more or less) Java has with some lessons learned.
Since none of the parts of Java* were free software, the GNU project created free versions of at least some parts (I don't really know which). But recently Sun relicensed Java under the GNU General Public License, making it free software.
Goal
I copied some name statistics from SCB ([9] and [10]) and was about to write a little Python script to harvest some statistics from them. But instead I decided to implement it in Java.
Show me the code
Indata
In namn.txt (download here: [11]), taken from the SCB homepage, only names with more than ten occurrences in at least one of the last ten years are listed.
Since I just copied it from their homepage the file contains unwanted line breaks, tabs and spaces.
    PojkNamn 2007 2006 2005 2004 2003 2002 2001 2000 1999 1998
    Aaron 35 31 21 24 20 15 - 12 - -
    Abbas 11 - - - - - - - - -
    Abbe 39 27 21 16 20 11 - - - -
    ...
    Zion 23 16 - - - - - - - -
    Åke 14 19 17 13 10 10 - 13 - -
    ----------------------------------------------------------------
    FlickNamn 2007 2006 2005 2004 2003 2002 2001 2000 1999 1998
    Ada 12 14 14 12 12 - 14 - - -
    Adela 13 12 - - 12 - - 10 - -
    ...
    Åsa - - 17 - 10 - 10 18 13 13
    Ängla 31 36 24 - 11 - - - - -
A name counter class
See NameCount.java (download here [12]).
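I implemented my own class that has an integer member and a string member. Inheriting from the String class would have been tempting, since we might want to do things like:
    if (nc1.startsWith("Per")) {
        // ...
    }
but String is declared final in Java and cannot be subclassed, so with this class you write nc1.name.startsWith("Per") instead.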
Also you might want to make the members private - but I didn't want to take the time to do so.
The class implements the interface Comparable so that we later can sort lots of namecounters.
public class NameCount implements Comparable<NameCount>
The interface requires us to code our own public int compareTo(NameCount nc). This is a sloppy and stupid implementation, but it does exactly what I want: it treats name counters with a large n as small, so they end up first when sorted. If two counters have an equal n, it sorts them alphabetically. The body of the function is:
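    // Alphabetical order of the names, used only as a tie-breaker.
    int mdiff = this.name.compareTo(nc.name);
    // Note: this line compiles only if n is declared as Integer, not int.
    int ndiff = this.n.compareTo(nc.n);
    if (ndiff != 0)
        return -ndiff;  // negated, so that a larger n sorts first
    else
        return mdiff;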
The class also contains a constructor and a toString method.
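Putting the pieces of this section together, the class could look roughly like the sketch below. This is not the contents of NameCount.java: the constructor signature and the "name, n" format of toString are guesses, n is assumed to be an Integer so that the compareTo body compiles, and the class is declared without public and given a small main method only so that the example fits in a single file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of a name counter: a name plus the number of occurrences.
class NameCount implements Comparable<NameCount> {
    String name;  // members deliberately not private, as mentioned in the text
    Integer n;    // assumed to be Integer (not int) so n.compareTo(...) works

    // Assumed constructor; the real signature is not shown in the text.
    NameCount(String name, int n) {
        this.name = name;
        this.n = n;
    }

    @Override
    public int compareTo(NameCount nc) {
        int mdiff = this.name.compareTo(nc.name);
        int ndiff = this.n.compareTo(nc.n);
        if (ndiff != 0)
            return -ndiff;  // larger counts sort first
        else
            return mdiff;   // equal counts: alphabetical order
    }

    // Assumed output format: the name, a comma, then the count.
    @Override
    public String toString() {
        return name + ", " + n;
    }

    // Small demo: sort three counters and print them.
    public static void main(String[] args) {
        List<NameCount> list = new ArrayList<>();
        list.add(new NameCount("Knut", 138));
        list.add(new NameCount("Anna", 3999));
        list.add(new NameCount("Alba", 138));
        Collections.sort(list);
        System.out.println(list);  // prints [Anna, 3999, Alba, 138, Knut, 138]
    }
}
```

Negating ndiff is safe here because Integer.compareTo only returns -1, 0 or 1; negating an arbitrary difference such as this.n - nc.n could overflow.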
Some kind of parser
My quick and dirty indata parser, SomeParser.java (download here: [13]), basically reads the indata file and sums the occurrences of each name into its own name counter instance, stored in a Vector (it could just as well have been some other kind of List).
Then it sorts the list with java.util.Collections.sort and prints it.
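The steps described above (read, sum per name, sort, print) can be sketched as follows. This is not the actual SomeParser.java but one way to do it under a few assumptions: the file is UTF-8, every name is a single token, the header tokens are exactly PojkNamn and FlickNamn, and a name that occurs in both lists gets one combined sum. The names ParserSketch and parse are made up, and an ArrayList is used instead of a Vector.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ParserSketch {
    // Compact name counter with the same ordering as described earlier:
    // large n first, then alphabetical.
    static class NameCount implements Comparable<NameCount> {
        final String name;
        final int n;
        NameCount(String name, int n) { this.name = name; this.n = n; }
        @Override public int compareTo(NameCount nc) {
            int ndiff = Integer.compare(this.n, nc.n);
            return ndiff != 0 ? -ndiff : this.name.compareTo(nc.name);
        }
        @Override public String toString() { return name + ", " + n; }
    }

    // Sums the yearly counts per name. The text is treated as a stream of
    // whitespace-separated tokens, so stray line breaks, tabs and spaces
    // in the input do not matter.
    static List<NameCount> parse(String text) {
        Map<String, Integer> sums = new LinkedHashMap<>();
        String current = null;  // the name whose counts are being read
        for (String tok : text.trim().split("\\s+")) {
            if (tok.isEmpty() || tok.matches("-+") || tok.equals("...")) {
                continue;       // missing value, separator line or elision
            } else if (tok.matches("\\d+")) {
                if (current != null)
                    sums.merge(current, Integer.parseInt(tok), Integer::sum);
            } else if (tok.equals("PojkNamn") || tok.equals("FlickNamn")) {
                current = null; // header: the year numbers after it are skipped
            } else {
                current = tok;  // any other token starts a new name
                sums.putIfAbsent(current, 0);
            }
        }
        List<NameCount> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : sums.entrySet())
            result.add(new NameCount(e.getKey(), e.getValue()));
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ParserSketch <file>");
            return;
        }
        String text = new String(Files.readAllBytes(Paths.get(args[0])),
                                 StandardCharsets.UTF_8);
        int rank = 1;
        for (NameCount nc : parse(text))
            System.out.println(rank++ + " " + nc);  // rank, name, total
    }
}
```

Working on tokens instead of lines is a design choice that sidesteps the messy whitespace in the copied file; the price is that the header words have to be recognized explicitly so that the years are not counted as occurrences.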
Compile it
    $ ls -l
    total 96
    -rw-r--r-- 1 per None   646 Aug 7 09:01 NameCount.java
    -rw-r--r-- 1 per None  1928 Aug 7 10:47 SomeParser.java
    -rw-r--r-- 1 per None 57878 Aug 5 11:37 namn.txt
    $ javac *.java
Use it with BASH
Names that are almost as popular as...
    $ java SomeParser namn.txt | grep -C 1 "Knut,\|Per,\|Anna,"
    61 Sofia, 4151
    62 Anna, 3999
    63 Johan, 3979
    --
    361 Emilio, 359
    362 Per, 359
    363 Pelle, 357
    --
    594 Alba, 138
    595 Knut, 138
    596 Viking, 138
Most popular names that start with a ...
    $ java SomeParser namn.txt | grep "\ A" | head -n 3
    4 Alexander, 9652
    5 Anton, 9625
    12 Amanda, 7803
    $ java SomeParser namn.txt | grep "\ B" | head -n 3
    89 Benjamin, 2884
    268 Bianca, 605
    292 Beatrice, 535
    $ java SomeParser namn.txt | grep "\ C" | head -n 3
    71 Carl, 3741
    85 Clara, 3104
    108 Casper, 2367