In this article, we’d like to address a burning problem: the lack of solid ML/DL solutions in Java. That was the reason why, at some point, we decided to set aside Gibson and Patterson’s book on DL4J. Today’s article by Humphrey Sheil suggests that was probably the right decision. We invite you to read the author’s thoughts on how to make Java competitive with Python in machine learning.
Humphrey Sheil recently gave a talk on the present and future of ML / DL in the enterprise. Large enterprises are interested in more practical questions than those discussed at research conferences (e.g., “How do I start using ML?”, “What is the best way to integrate ML with my existing systems?”, etc.). The talk was followed by a panel discussion on Java and machine learning.
The problem is that Java is nearly absent from the machine learning scene. ML frameworks written in Java are few. Of course, there is DL4J, but he doesn’t know anyone who actually uses it. MXNet has a Scala API but no Java one, and Tensorflow’s Java API is incomplete. Meanwhile, Java plays a huge role in the enterprise: over the last 20 years, companies have invested trillions of dollars in Java across a plethora of fields, such as finance, online transactions, online stores, and telecom. The list goes on and on. Yet when it comes to machine learning, the main language is Python, not Java. He personally likes working with both Python and Java. At the same time, Frank Greco posed an interesting question that got him thinking:
“Java doesn’t necessarily have to compete with Python in ML. Why can’t Java provide true ML support?”
Is it important?
Let’s go deeper into the issue. Since 1998, Java has been among the top programming languages, serving as an integral part of all major solution areas: web technologies, mobile apps, browser vs. native applications, messaging systems, internationalization (i18n) and localization (l10n), scaling out, and repositories for all kinds of enterprise information, from relational databases to Elasticsearch.
Such popularity encourages Java teams to believe in themselves and take on new challenges. There is no “magic” component or API that cannot be improved or replaced by a capable team of Java developers.
However, when it comes to machine learning, this is not the case, and Java teams have two possible ways to go:
- Learn the ML stack themselves (which in practice means retraining in Python and its libraries)
- Use a vendor API to add ML capability to an existing enterprise system
However, neither of these scenarios is good enough. The first one requires massive investments of time and money, plus ongoing maintenance costs. The second one creates dependency on the vendor, and dealing with third-party components is rather risky: it may mean sharing sensitive data with people outside your enterprise. In some situations, that is simply not acceptable.
Also, there is an even bigger problem: cultural abdication. Teams cannot change code they don’t understand and don’t know how to maintain. As a result, there is no clear ownership of responsibilities, and the core work has to be delegated to someone else. Teams consisting solely of Java developers risk missing the machine learning wave that will dominate enterprise computing in the near future.
With that said, it’s important that the Java language and platform finally get first-class ML support. Otherwise, there is a risk that in 5-10 years Java will be displaced by languages with better ML support.
Why does Python dominate ML?
For starters, let’s find out why Python has become the leading language in ML and DL.
I presume it all started with a pretty “innocuous” feature: list slicing. This support is extensible: any Python class implementing the __getitem__ and __setitem__ methods can be sliced with the same syntax. The fragment below shows how simple and natural this Python feature is.
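For illustration, here is a minimal sketch of the idea; the Squares class is made up for this example and not taken from any real library:

```python
# Built-in slicing on a plain list
primes = [2, 3, 5, 7, 11, 13]
print(primes[1:4])     # [3, 5, 7]
print(primes[::-1])    # reversed: [13, 11, 7, 5, 3, 2]

# Any class implementing __getitem__ / __setitem__ gets the same syntax
class Squares:
    def __init__(self, n):
        self._data = [i * i for i in range(n)]

    def __getitem__(self, key):        # key may be an int or a slice object
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

s = Squares(10)
print(s[2:5])   # [4, 9, 16]
s[0] = -1
print(s[0])     # -1
```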
That’s not all, of course. Compared with equivalent Java code, Python code is much more concise. Exceptions are supported but unchecked, and developers can easily write quick, draft-quality Python scripts to test things out without drowning in Java’s “everything is a class” approach. In short, Python is easier to work with.
However, Python’s main advantage is that its community managed to build NumPy, a clean and fast numeric computing library. NumPy is built around ndarray, an N-dimensional array. To quote the official documentation: “NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of positive integers.” NumPy gets your data into an ndarray and performs various operations on it. It supports a plethora of indexing, broadcasting, and vectorization methods, and makes it easy to create and manage large arrays of numbers.
The snippet below showcases ndarray indexing and broadcasting, which are the key operations in ML / DL.
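A small stand-in example of both ideas (the array values are arbitrary):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# Indexing and basic slicing
print(a[1, 2])       # single element -> 6
print(a[:, 1])       # second column  -> [1 5 9]
print(a[0:2, 2:4])   # 2x2 sub-matrix -> [[2 3] [6 7]]

# Broadcasting: the 1-D row is "stretched" across every row of a
row = np.array([10, 20, 30, 40])
print(a + row)       # still shape (3, 4), no explicit loop

# Broadcasting with a scalar
print(a * 0.5)
```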
Working with large multidimensional arrays of numbers is at the heart of ML programming, and of DL programming in particular. A deep neural network is a lattice of nodes and edges modeled by numbers, and running it boils down to fast matrix multiplication.
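To make that concrete, here is a toy example (with made-up sizes) of a single dense-layer forward pass, which is just a matrix multiplication plus a bias:

```python
import numpy as np

batch = np.random.rand(64, 128)      # 64 inputs with 128 features each
weights = np.random.rand(128, 256)   # a fully connected layer with 256 units
bias = np.random.rand(256)

# ReLU(x . W + b): one matrix multiplication per layer, repeated many times at run time
activations = np.maximum(0.0, batch @ weights + bias)
print(activations.shape)             # (64, 256)
```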
A number of libraries are built on top of NumPy, such as SciPy, pandas, and many others. The major DL libraries (Google’s Tensorflow, Facebook’s PyTorch) are heavily focused on Python. Tensorflow does have APIs for Go, Java, and JavaScript, but they are regarded as incomplete and unstable. PyTorch grew out of Torch, which was written in Lua; its popularity surged in 2017 when it moved from that niche language to Python, the mainstream language of ML.
Disadvantages of Python
Python is not a perfect programming language. For example, CPython has a global interpreter lock (GIL), which makes scaling quite a challenge. What’s more, Python DL frameworks (e.g., PyTorch and Tensorflow) still hand off their key methods to opaque implementations; for example, NVIDIA’s cuDNN library has had a major influence on the RNN / LSTM implementation in PyTorch. RNNs (recurrent neural networks) and LSTMs (long short-term memory networks) are highly useful DL tools for business apps because, among other things, they specialize in classifying and predicting sequences of variable length (e.g., web clickstreams, text fragments, user events, etc.).
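As a rough illustration of that last point, here is a minimal PyTorch sketch of classifying variable-length sequences with an LSTM; the SequenceClassifier class, its dimensions, and the data are all invented for this example:

```python
import torch
import torch.nn as nn

class SequenceClassifier(nn.Module):
    def __init__(self, n_features=8, hidden=32, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x, lengths):
        # Pack the padded batch so the LSTM ignores the padding
        packed = nn.utils.rnn.pack_padded_sequence(
            x, lengths, batch_first=True, enforce_sorted=False)
        _, (h_n, _) = self.lstm(packed)   # h_n: (1, batch, hidden)
        return self.head(h_n[-1])         # one score vector per sequence

model = SequenceClassifier()
batch = torch.randn(4, 10, 8)             # 4 padded sequences, max length 10, 8 features
lengths = torch.tensor([10, 7, 5, 3])     # true length of each sequence
logits = model(batch, lengths)
print(logits.shape)                       # torch.Size([4, 2])
```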
In all fairness, it must be said that this opacity is characteristic of almost any ML/DL framework, except those written directly in C or C++. Why? Because to maximize performance of core high-frequency operations (e.g., matrix multiplication), developers try to get as “close to the metal” as possible.
How can Java become competitive?
To give Java a healthy ML ecosystem, three main improvements are needed:
- Add indexing/slicing support to the core language, so Java can compete with Python’s conciseness and ease of use. These capabilities might build on the existing ordered-collection interface List&lt;E&gt;, and would also require allowing overloading (see point 2).
- Build a tensor implementation in the core math libraries. That set of classes and interfaces would perform the same functions as ndarray and provide the indexing support described above, i.e., the three indexing types available in NumPy: field access, basic slicing, and advanced indexing, which are essential for convenient coding.
- Implement broadcasting, for scalars and for tensors of arbitrary but compatible dimensions.
If we could implement those features in the core Java language and runtime, we would get a green light for creating NumJava, a Java alternative to NumPy. We could also use Project Panama to provide vectorized, low-level access to fast tensor operations implemented on CPUs, GPUs, TPUs, and so on, and thus make Java ML one of the fastest computing options available.
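As one hedged illustration of what Project Panama already offers in this direction, the incubating Vector API (jdk.incubator.vector) lets plain Java express SIMD-style kernels such as the dot product below. Run it with --add-modules jdk.incubator.vector, and treat it as a sketch rather than a tuned implementation:

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

public class DotProduct {
    private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    static float dot(float[] a, float[] b) {
        float sum = 0f;
        int i = 0;
        int upper = SPECIES.loopBound(a.length);
        // Process SPECIES.length() floats per iteration using SIMD lanes
        for (; i < upper; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            sum += va.mul(vb).reduceLanes(VectorOperators.ADD);
        }
        // Scalar tail for the leftover elements
        for (; i < a.length; i++) sum += a[i] * b[i];
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        float[] b = {9, 8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println(dot(a, b));   // 165.0
    }
}
```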
I’m not saying these improvements are trivial. However, they could take the Java platform to a whole new level.
In the snippet below, you can see how NumPy broadcasting and indexing could look in NumJava with a Tensor class, slicing syntax support, and the current restriction on operator overloading.
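Since NumJava does not exist, the following is a purely hypothetical Java sketch: method calls stand in for the proposed slicing syntax and for operator overloading, and the tiny 2-D Tensor class is written here only to make the example self-contained.

```java
import java.util.Arrays;

public class NumJavaSketch {

    /** Minimal 2-D tensor: just enough to illustrate slicing and broadcasting. */
    static final class Tensor {
        final double[][] data;

        Tensor(double[][] data) { this.data = data; }

        /** Basic slicing: rows [r0, r1), columns [c0, c1) -- analogous to a[r0:r1, c0:c1]. */
        Tensor slice(int r0, int r1, int c0, int c1) {
            double[][] out = new double[r1 - r0][];
            for (int r = r0; r < r1; r++)
                out[r - r0] = Arrays.copyOfRange(data[r], c0, c1);
            return new Tensor(out);
        }

        /** Add a same-shape tensor, or broadcast a single-row tensor across all rows. */
        Tensor add(Tensor other) {
            double[][] out = new double[data.length][data[0].length];
            for (int r = 0; r < data.length; r++)
                for (int c = 0; c < data[0].length; c++)
                    out[r][c] = data[r][c] + other.data[other.data.length == 1 ? 0 : r][c];
            return new Tensor(out);
        }

        /** Broadcast a scalar -- analogous to a * 0.5. */
        Tensor mul(double scalar) {
            double[][] out = new double[data.length][data[0].length];
            for (int r = 0; r < data.length; r++)
                for (int c = 0; c < data[0].length; c++)
                    out[r][c] = data[r][c] * scalar;
            return new Tensor(out);
        }

        @Override public String toString() { return Arrays.deepToString(data); }
    }

    public static void main(String[] args) {
        Tensor a = new Tensor(new double[][] {{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}});
        Tensor row = new Tensor(new double[][] {{10, 20, 30, 40}});

        System.out.println(a.slice(0, 2, 2, 4));  // NumPy: a[0:2, 2:4]
        System.out.println(a.add(row));           // NumPy: a + row (broadcasting)
        System.out.println(a.mul(0.5));           // NumPy: a * 0.5
    }
}
```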
Prospects and call to action
Machine learning is going to revolutionize the business world just as relational databases, the Internet, and mobile technologies did before it. While there is plenty of hype around ML, there are also clever and convincing insights. For example, this article envisions a near future where optimal database, web-server, and application-server configurations are learned using ML. You won’t even have to deploy ML in your system yourself; your vendor will do it for you.
Following that thought, we could write as many ML and DL frameworks in Java (running on the JRE) as we have web frameworks, persistence layers, or XML parsers. Can you imagine that? We could have Java frameworks supporting CNNs (convolutional neural networks) for advanced computer vision, recurrent neural network implementations for sequential datasets, advanced ML capabilities such as automatic differentiation, and more. Those frameworks could implement and power the next generation of enterprise systems, which would integrate smoothly with existing Java systems and use the same tooling: IDEs, testing frameworks, and continuous integration. What Java fan wouldn’t like such a prospect?