ML infrastructure with PySpark for a Java backend

  • Thread starter: Avatrin
  • Tags: Java
Avatrin
Let's say I have experience creating ML models in Python and have decided to train my models on Spark using PySpark. This will form part of an ML infrastructure for a website with a Java or C# backend. How can I make this work? I am a beginner when it comes to Spark.

I am looking for any tutorial(s) that show me how to create a complete ML cloud infrastructure which can be trained with Python but accessed through Java or C#.
 
First, if you don't know much about Apache Spark, you can read through this tutorial from tutorialspoint.com. As prerequisites, you should have some prior exposure to Scala programming, at least a grasp of basic database concepts, and some experience with a Linux distribution.

Then, if you haven't already done so, you must learn how to build ML pipelines with PySpark. There are tutorials for this, such as this one from tutorialspoint.com. There are of course other good tutorials as well, which you can find by searching online.

Now, for the Java backend you ask about: assuming you plan to write everything related to your ML model(s) in Python and then call a Python script from Java, you can write Java code to do some processing (for example, a batch processing task that prepares data for predictions from a deep learning model), export the preprocessed data to .csv or .json format, and then call your Python script, from bash for instance, passing the parameters. To get an idea of the process, take a look at this example of using deep learning models within the Java ecosystem at digital-thinking.de: http://digital-thinking.de/how-to-using-deep-learning-models-within-the-java-ecosystem/
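
To make the export-then-call idea a bit more concrete, here is a minimal sketch in plain Java. It is only an illustration under assumptions of mine, not a tested setup: the script name predict.py, the --input and --output flags, the python3 interpreter name and the feature columns x1 and x2 are all hypothetical, and the rows are assumed to be plain numbers so that no CSV quoting is needed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class PythonBatchBridge {

    // Writes a header line plus one comma-separated line per row.
    // Assumes purely numeric features, so no quoting or escaping is done.
    static void writeCsv(Path target, String[] header, List<double[]> rows) throws IOException {
        List<String> lines = new ArrayList<>();
        lines.add(String.join(",", header));
        for (double[] row : rows) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(row[i]);
            }
            lines.add(sb.toString());
        }
        Files.write(target, lines, StandardCharsets.UTF_8);
    }

    // Builds the command line for the Python script.
    // The --input/--output flags are hypothetical; use whatever your script parses.
    static List<String> buildCommand(String interpreter, Path script, Path input, Path output) {
        return List.of(interpreter, script.toString(),
                "--input", input.toString(),
                "--output", output.toString());
    }

    // Starts the process, forwards its output to this console and returns the exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        Path input = Files.createTempFile("features", ".csv");
        Path output = Files.createTempFile("predictions", ".csv");
        writeCsv(input, new String[] {"x1", "x2"},
                List.of(new double[] {1.0, 2.5}, new double[] {3.0, 4.5}));

        Path script = Path.of("predict.py"); // hypothetical script name
        List<String> command = buildCommand("python3", script, input, output);
        System.out.println(String.join(" ", command));

        // Only call the script if it is actually present next to this program.
        if (Files.exists(script)) {
            System.out.println("exit code: " + run(command));
        } else {
            System.out.println("predict.py not found; skipping the call");
        }
    }
}
```

The Python side would read the file given by --input, load the trained model, and write its predictions to the file given by --output, which the Java code can then read back. Passing file paths as parameters keeps the two sides decoupled, at the cost of starting a new process for every batch.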

Needless to say, to accomplish your specific goal(s) you'll need to mix and match things accordingly. I don't think you can find a start-to-finish tutorial covering the whole setup you have in mind.
 
