First, if you don't know much about Apache Spark, you can read through this tutorial from tutorialspoint.com. Before reading it, you should have some prior exposure to Scala programming, (at least) basic database concepts, and some experience with a Linux distro.
Then, if you haven't already done so, you should learn how to build ML pipelines with PySpark. There are tutorials for this, like this one from tutorialspoint.com. There are of course other good tutorials as well, which you can find by googling.
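To give a rough idea of what such a pipeline looks like, here is a minimal sketch. The file path and the column names ("feature1", "feature2", "label") are hypothetical placeholders, not part of any tutorial; adapt them to your own data.

```python
# Minimal PySpark ML pipeline sketch. The file path and column names
# ("feature1", "feature2", "label") are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("pipeline-example").getOrCreate()

# Load training data; header/inferSchema assume a plain CSV with named columns.
df = spark.read.csv("train.csv", header=True, inferSchema=True)

# Combine raw columns into the single feature vector Spark ML expects.
assembler = VectorAssembler(inputCols=["feature1", "feature2"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")

pipeline = Pipeline(stages=[assembler, lr])
model = pipeline.fit(df)

# Persist the fitted pipeline so a separate scoring script can reload it.
model.write().overwrite().save("my_pipeline_model")
```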
Now, regarding the Java backend you ask about: assuming you are planning to write everything related to your ML model(s) in Python and then call a Python script from Java, you can write Java code to do some processing (for example, some batch-processing task), export the preprocessed data to .csv or .json format, and then call your Python script, e.g. via bash, passing the parameters, to run predictions with a deep learning model. You can take a look at this example of using deep learning models within the Java ecosystem at digital-thinking.de (http://digital-thinking.de/how-to-using-deep-learning-models-within-the-java-ecosystem/) to get an idea of the process.
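As a rough sketch of the Python side of that handoff (not the approach from the linked article), the script below reads the CSV the Java code exported, reloads the pipeline model saved earlier, and writes predictions back out for the Java side to pick up. The paths, argument names, and output column are all assumptions for illustration.

```python
# Hypothetical scoring script, invoked from the Java side, e.g.:
#   python score.py --input preprocessed.csv --output predictions
# Paths, argument names, and the saved-model location are placeholders.
import argparse

from pyspark.sql import SparkSession
from pyspark.ml import PipelineModel

parser = argparse.ArgumentParser()
parser.add_argument("--input", required=True)   # CSV exported by the Java code
parser.add_argument("--output", required=True)  # directory to write predictions
args = parser.parse_args()

spark = SparkSession.builder.appName("scoring-example").getOrCreate()

# Reload the fitted pipeline and score the preprocessed data.
model = PipelineModel.load("my_pipeline_model")
df = spark.read.csv(args.input, header=True, inferSchema=True)
predictions = model.transform(df)

# Keep only the prediction column and export as CSV for the Java side.
predictions.select("prediction").write.mode("overwrite").csv(args.output)
```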
Needless to say, in order to accomplish your specific goal(s) you'll need to mix and match these pieces accordingly. I don't think you'll find a start-to-finish tutorial covering the whole thing for the specific goal you have in mind.