Hacker News

fniephaus
GraalPy – A high-performance embeddable Python 3 runtime for Java graalvm.org

Rochus4 months ago

In case someone is interested, here are some benchmark results comparing GraalPy and others with JDK8 using the Are-we-fast-yet benchmark suite: https://stefan-marr.de/downloads/tmp/awfy-bun.html

And here is a table representation of all benchmarks and the geomean and median overall results: http://software.rochus-keller.ch/awfy-bun-summary.ods

The same benchmark suite runs about 2.4 times faster (geomean) on JDK8 than on GraalPython EE 22.3 Hotspot, and 41 times faster than on CPython 3.11. GraalPython is thus about 17 times faster than CPython, and about two times faster than PyPy. The Graal Enterprise Edition (EE) seems to be a factor of 1.31 faster than the Community Edition (CE).
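For anyone checking the arithmetic, here is a small sketch of how the 17x figure follows from the two JDK8-relative factors. The per-benchmark ratios at the end are made up purely to show how a geomean is computed; they are not taken from the linked results.

```python
from statistics import geometric_mean

# Geomean factors relative to JDK8, as quoted in the comment above.
jdk8_vs_graalpy = 2.4   # JDK8 is ~2.4x faster than GraalPython EE 22.3
jdk8_vs_cpython = 41.0  # JDK8 is ~41x faster than CPython 3.11

# If JDK8 is 41x faster than CPython and 2.4x faster than GraalPy,
# then GraalPy is 41 / 2.4 times faster than CPython.
graalpy_vs_cpython = jdk8_vs_cpython / jdk8_vs_graalpy
print(f"GraalPy vs CPython: {graalpy_vs_cpython:.1f}x")

# Hypothetical per-benchmark speedups, only to illustrate the geomean itself.
hypothetical_ratios = [1.0, 4.0, 16.0]
overall = geometric_mean(hypothetical_ratios)
print(f"geomean of {hypothetical_ratios}: {overall:.2f}")
```

The geomean is used instead of the arithmetic mean so that one outlier benchmark does not dominate the overall ratio.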

masklinn4 months ago

Your mileage may very much vary, much like pypy this is very inconsistent and highly dependent on your workload (as well as your dependencies).

My limited experience was that on re-heavy workload pypy is several times slower than cpython (~3x compared to 3.10) and graal is even worse (~6x compared to 3.11).
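A rough sketch of how one might measure a re-heavy workload so the same file can be run under CPython, PyPy, and GraalPy. The pattern and the synthetic log lines are invented for illustration and are not the workload described above; real results will depend heavily on the actual patterns and inputs.

```python
import re
import sys
import timeit

# Hypothetical workload: extract key=value pairs from synthetic log lines.
PATTERN = re.compile(r"(\w+)=(\d+)")
LINES = [f"ts={i} user={i % 7} bytes={i * 13}" for i in range(2_000)]

def count_matches(lines):
    """Count every key=value match across all lines."""
    return sum(len(PATTERN.findall(line)) for line in lines)

# Repeat the measurement; on a JIT the early rounds include warm-up,
# so report both the first and the best round.
rounds = timeit.repeat(lambda: count_matches(LINES), number=5, repeat=5)
print(f"{sys.implementation.name}: first={rounds[0]:.4f}s best={min(rounds):.4f}s")
```

Running the identical script with each interpreter and comparing the "best" numbers gives a like-for-like figure for this one workload only.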

mike_hearn4 months ago

Which version was that with? GraalVM can JIT compile regular expressions these days, with the same compiler as everything else. They implemented TRegex on top of Truffle so regex can be inlined and optimized like regular code.

Performance does indeed depend on workload. There's a page that compares GraalPy vs CPython and Jython on the Python Performance Suite which aims to be "real world":

https://www.graalvm.org/latest/reference-manual/python/Perfo...

There the speedup is smaller, but this is partly because a lot of real world Python workloads these days spend all their time inside C or the GPU. Having a better implementation is still a good idea though, because it means more stuff can be done by researchers who don't know C++ well or at all. The point at which you're forced to get dedicated hackers involved to optimize gets pushed backwards if you can rely on a good JIT.

masklinn4 months ago

> Which version was that with?

24.1. 23 may or may not have been worse, I didn’t take specific notes aside from “too slow to be acceptable”

Rochus4 months ago

That is why we should always use a standardized, controlled benchmark suite, which has well-defined rules to assure fair cross-language comparisons with a representative, well-balanced workload. By focusing on a core set of language features and abstractions, Are-we-fast-yet allows for a more controlled comparison of language implementation performance, isolating the effects of compiler and runtime optimizations.

This is especially important for scripting languages like Python, where a large part of the features are implemented in C or other native languages and called via FFI. That's why, for example, the benchmark implements its own collections, because we want to know how fast the interpreter is. Otherwise, as you have noticed, the result is randomly influenced by how much compute a particular application can delegate to the FFI.
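To make the distinction concrete, here is a minimal sketch (not the suite's actual code) of a hand-written collection next to the built-in equivalent. The `Vector` class and both helper functions are hypothetical; the point is only that the first path exercises interpreted Python code while the second mostly runs inside the runtime's native `list` and `sum`.

```python
import timeit

class Vector:
    """Minimal growable array written in pure Python (illustrative only)."""

    def __init__(self, capacity=8):
        self._storage = [None] * capacity
        self._size = 0

    def append(self, item):
        if self._size == len(self._storage):
            # Grow by hand so the copying loop runs as interpreted code.
            new_storage = [None] * (2 * len(self._storage))
            for i in range(self._size):
                new_storage[i] = self._storage[i]
            self._storage = new_storage
        self._storage[self._size] = item
        self._size += 1

    def total(self):
        result = 0
        for i in range(self._size):
            result += self._storage[i]
        return result

def with_vector(n):
    v = Vector()
    for i in range(n):
        v.append(i)
    return v.total()

def with_builtin(n):
    # Almost all of the work happens inside native list/sum code.
    return sum(list(range(n)))

for fn in (with_vector, with_builtin):
    best = min(timeit.repeat(lambda: fn(10_000), number=10, repeat=3))
    print(f"{fn.__name__}: {best:.4f}s")
```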

masklinn4 months ago

> That's why, for example, the benchmark implements its own collections, because we want to know how fast the interpreter is. Otherwise, as you have noticed, the result is randomly influenced by how much compute a particular application can delegate to the FFI.

That sounds like the exact opposite of what I would want as a user of the language: the benchmark completely abstracts the actual behaviour of the runtime, claiming purported gains which don’t come anywhere near manifesting when trying to run actual software.

I’m not implementing my own collections when `dict` suffices, and I don’t really care that a pure python version of `re` runs faster in graal than in cpython, because I’m not using that.

So what happens is I see claims that graalpython runs 17 times faster than cpython, I try it out, it runs 6 times slower instead, and I can only conclude that graal is a worthless pile of lies and I should stop caring.

Rochus4 months ago

If you don't know exactly what you are measuring, the measurement is worthless. We must therefore isolate the measurement subject for the measurement, and avoid uncontrollable influences as far as possible. This is how engineering works, and every engineer should also be aware of measurement errors. In addition, repeatability and falsifiability of the experiment and conclusions are required for scientific claims. The mere statement "too slow to be acceptable" or "worthless pile of lies" is not enough for this.

A measurement method does not have to represent every practical application of the measured subject. In the present case, the measurement allows a statement to be made about the performance of the interpreter (CPython) in relation to the JIT compiler (GraalPy). Whether the technology is right for your specific application or not is another question.

igouy4 months ago

Their point seems to be a tautology: If what you are measuring is worthless, the measurement is worthless.

Whether the measurement is in-some-sense formally correct or not is another question.


jsmeaton4 months ago

Tried to use graalvm (interpreter) to run a fairly large project at my $dayjob$ and ran into a few issues right away.

  - Maturin doesn't support the graal interpreter, so no PyO3 packages
  - uv doesn't seem to run, as `fork` and `execve` are missing from the os package?
  - Graal seems to have a huge number of patches to popular libraries so that they'll run; most seem to patch C files to add additional IFDEFs

I don't think Graal is going to be a viable target for large projects with a huge set of dependencies unfortunately, as the risk of not being able to upgrade to different versions or add newer dependencies is going to be too high.

It's impressive what it does seem to support though, and probably worth looking at if you have a smaller scale project.

mike_hearn4 months ago

The number of patches is going down with time and many are trivial one liners, e.g. uvloop

https://github.com/oracle/graalpython/blob/b907353de1b72a14e...

    -        self.cython_always = False
    +        self.cython_always = True

That's the entire patch. Others are working around bugs in the C extensions themselves that a different implementation happens to expose, and can be upstreamed:

https://github.com/oracle/graalpython/blob/b907353de1b72a14e...

Still others exist for old module versions, but are now obsolete:

https://github.com/oracle/graalpython/blob/b907353de1b72a14e...

    # None of the patches are needed since 43.0, the pyo3 patches have been upstreamed

And finally, some are just general portability improvements. Fork doesn't exist on Windows. Often it can be replaced with just starting a sub-process.
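As a sketch of that kind of portability change, the snippet below starts a child interpreter with `subprocess` instead of calling `os.fork` directly. The `run_child` helper is hypothetical and not taken from any of the linked patches.

```python
import os
import subprocess
import sys

def run_child(code):
    """Run a Python snippet in a child process and return its exit status."""
    # Spawning a fresh interpreter works wherever subprocess does,
    # including platforms or runtimes where os.fork is unavailable.
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode

print("os.fork available:", hasattr(os, "fork"))
print("child exit status:", run_child("import sys; sys.exit(3)"))
```

The trade-off is that the child does not inherit the parent's in-memory state, so anything it needs has to be passed explicitly (arguments, environment, pipes).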

So the patching situation has been getting much better over time, partly due to the GraalPy team actively getting involved with and improving the Python ecosystem as a whole.

steve_s4 months ago

There is basic GraalPy support in Maturin[0] and PyO3[1], the problem is often that packages require older Maturin/PyO3 versions and/or they use CPython-isms, semi-public APIs, etc., but it is getting better, for example [2].

It is fair to say that large projects with a huge set of dependencies will likely face some compatibility issues, but we're working on ironing this out. There is GraalPy support in setup-python GitHub action. GraalPy is supported in the manylinux image [3]. Hopefully soon also in cibuildwheel [4].

[0] https://github.com/PyO3/maturin/pull/1645 (merged)

[1] https://github.com/PyO3/pyo3/pull/3247 (merged)

[2] https://github.com/pydantic/jiter/pull/135 (merged)

[3] https://github.com/pypa/manylinux/pull/1520 (merged)

[4] https://github.com/pypa/cibuildwheel/pull/1538

jsmeaton4 months ago

Appreciate the further details, thanks! This is a huge undertaking, good luck, and I'll be checking back in here and there.

nomercy4004 months ago

To be fair, this also happened when Graal was released for Java. Give it another go in 3-6 months, the Graal team will have improved interoperability massively.

It is a chicken (interpreter) and egg (dependencies) problem. You cannot fix the dependency problems without the interpreter. Neither can you release an interpreter with full dependency support.

sitkack4 months ago

For projects using GraalPy, I'd wager that most would vendor all their dependencies at the start of the project and upgrade along the way. I have shipped a couple products with Jython, and very little 3rd party code was used and almost none of the standard library, it was all driving Java from the same project.

So it does have to do with scale but in the opposite direction. Big long projects will want to adopt something like GraalPy because of how long the project will take.

jsmeaton4 months ago

What I was hoping to be able to do was run our existing cpython project on graal to try and benefit from whatever speedups the jvm (or, if possible, compiling to a native module) would provide, rather than build with the jvm specifically in mind from the get go.

sitkack4 months ago

That is a problem with Python as a language and a platform and has nothing to do with Graal. PyPy is in the same boat. If the alternative Pythons banded together there would be 10.

tannhaeuser4 months ago

I guess what makes Python interesting right now is the integration with ML toolchains, CUDA, Metal/MLX, pytorch, tensorflow, LLM encoders/decoders, etc. more than Python the language. But can GraalVM run those codes meaningfully when Python is merely used for glue code with the important bits implemented in native code?

tln4 months ago

Yes, apparently it can

https://www.graalvm.org/dev/reference-manual/python/Native-E...

> CPython provides a native extensions API for writing Python extensions in C/C++. GraalPy provides experimental support for this API, which allows many packages like NumPy and PyTorch to work well for many use cases. The support extends only to the API, not the binary interface (ABI), so extensions built for CPython are not binary compatible with GraalPy. Packages that use the native API must be built and installed with GraalPy, and the prebuilt wheels for CPython from pypi.org cannot be used. For best results, it is crucial that you only use the pip command that comes preinstalled in GraalPy virtualenvs to install packages. The version of pip shipped with GraalPy applies additional patches to packages upon installation to fix known compatibility issues and it is preconfigured to use an additional repository from graalvm.org where we publish a selection of prebuilt wheels for GraalPy. Please do not update pip or use alternative tools such as uv.

fniephausop4 months ago

For anyone interested, here's the PyPI repository with additional binary wheels for GraalPy: https://www.graalvm.org/python/wheels/

We also want to make it easy for Python package maintainers to test and build wheels for GraalPy. It's already available via setup-python, and we are adding GraalPy support to cibuildwheel. If you need any help, please reach out to us!

theLiminator4 months ago

I wonder if hpy will solve the extension problem.

RMPR4 months ago

While hpy is great and I'm excited about it, I would rather bet on the limited C API[0] (which is basically what hpy tries to be if I understand correctly).

0: https://devguide.python.org/developer-workflow/c-api/#limite...

steve_s4 months ago

The Limited C API is not as abstract as HPy. Most notably, the Limited C API still exposes reference counting as the memory management mechanism, whereas HPy abstracts that. However, ecosystem-wide adoption of the Limited C API and stable ABI would already improve things significantly.

mike_hearn4 months ago

Yes, that's the idea.

pjmlp4 months ago

I am willing to live with Python as the Lisp we deserve to have, on this AI wave, when it finally gets a proper JIT story we can rely on, regardless of the workload.

Currently it is a mix and match of a herculean engineering effort mostly ignored by the community (PyPy), DSLs for GPGPUs, a bunch of C and C++ libraries that people keep referring to as "Python" when any language can have similar bindings, jython, IronPython, GraalPy,...

So it isn't for lack of trying, at least we finally have CPython folks more welcoming to performance improvements, and JITs.

fastball4 months ago

The problem with PyPy is that it doesn't support the C-API, which is required for all those other high performance libraries.

So you gain the perf of a JIT, while losing out on most everything else high-performance in the Python ecosystem.

masklinn4 months ago

Pypy has cpyext which implements a subset of the C-API, however it comes with a long list of caveats, and is more of a backstop, they very much prefer cffi.

RMPR4 months ago

> I am willing to live with Python as the Lisp we deserve to have

You can have your cake and eat it too https://github.com/hylang/hy

yosefk4 months ago

The reasons for all this stuff having been developed in Python also make Python interesting right now, all by themselves. It did not happen by accident; this stuff was developed fairly recently and there was no shortage of mature languages to choose from.

BiteCode_dev4 months ago

The people disliking the language are very vocal about it, but there is a huge number of silent people who love it and an even bigger number who just like it as much as the alternatives. It's mainstream now, not trending like 10 years ago, so there is no hype about it anymore. We just use it to do stuff.

Add to that the existing excellent ecosystem, the strong culture of scientific stacks and a very good story for providing C extensions (actually the best one in all scripting languages because of things like cibuildwheel).

It's only in small tech bubbles like HN that devs find it surprising.

pm904 months ago

Python has many issues that are quite clear when you operate at some kind of scale and need proper multiprocessing/multithreading support. And it's not just the GIL, you get very unexpected behaviors when dealing with exit handlers and signal handlers in edge cases. Having seen what other languages look like, it just doesn’t feel like a language that was designed well for running at scale.
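One documented example of such an edge case, as a minimal sketch: `atexit` handlers run on a normal `sys.exit()` but are skipped when the process leaves through `os._exit()` (the `atexit` docs say the same applies when the process is killed by a signal Python does not handle). The `child_output` helper is hypothetical and only exists to run each variant in its own interpreter.

```python
import subprocess
import sys

# Child program: register a cleanup handler, then exit in the given way.
CHILD = """
import atexit, os, sys
atexit.register(lambda: print("cleanup ran"))
{exit_call}
"""

def child_output(exit_call):
    """Run a child interpreter that exits via `exit_call`; return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD.format(exit_call=exit_call)],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()

print("sys.exit(0):", repr(child_output("sys.exit(0)")))
print("os._exit(0):", repr(child_output("os._exit(0)")))
```

Worker processes in multiprocessing setups are a common place where this difference shows up, since cleanup code registered in the worker may never run.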

The tooling has markedly improved though. Things like typing and compile time checks, great. But it's also funny to me that some of the fastest tools for python are being built in rust (eg uv).

devjab4 months ago

I’ve always found Python to be sort of loved on HN. Not by everyone of course, but I guess it depends on each of our experiences on here. I’m usually rather surprised when I meet people who genuinely dislike Python, because that seems like such an odd occurrence. Even if people don’t “love” the language, most people seem to have had rather fond experiences or memories of it. Usually criticism comes down to its inefficiencies, but those aren’t exactly unreasonable critiques.

As I said it’s anecdotal, but in my experience Python gets a lot of love compared to something like Java or C#. Both of which are often met with real harshness. Hell I’ve ranted unseemly about C# myself.

lordgroff4 months ago

I mean, even on HN, I'd say if there's derision, it's mostly one uttered with a yawn rather than genuine hate. And that's almost justified; while I spend a lot of my time with lots of different languages (I can't think of a single one I outright hate btw), Python is the one that pays for my things and... Well, there's not much drama there is there (now that we're past 2->3 anyway)? It's a glue language that's easy to learn, but offers tons of depth should you want it. My primary annoyance at Python used to be the typing, but type annotations have made this less of an issue. It's a nice language and you can do almost everything with it. It's a bit boring, but I guess that's a good thing.

pjmlp4 months ago

My only big criticism is the CPython folks' resistance to any kind of performance improvements, and the way PyPy efforts have been largely ignored, making Python the last major dynamic language to finally start caring about performance and having a JIT in the box.

Finally thanks to data science, and people getting fed up with always writing bindings, this is changing, and Python can join Common Lisp, Scheme, Smalltalk, SELF, JavaScript, Ruby, Lua, Dylan, Julia, BASIC club.

neonsunset4 months ago

To be fair, Ruby YJIT can still somehow be slower than Python 3.11+ at times. Lua JIT is a bit of a separate effort and the ones that truly try the hardest here are Julia (which is not necessarily interpreted, it's really difficult to call it a pure scripting language at this point) and JavaScript engine implementations. I can see pretty good numbers for SBCL on BenchmarksGame too.

pjmlp4 months ago

Yes, but it isn't the only one, the oldest Ruby JIT goes back to Ruby Motion, still being sold.

I would not put Julia and Common Lisp in the scripting basket, yep.

gpderetta4 months ago

and elisp. Let's not forget that elisp got a JIT (well, actually a static compiler) before CPython did!

pjmlp4 months ago

Indeed, I only listed the languages I know more about, it has been a long time since XEmacs was my poor man's IDE on UNIX systems.

gpderetta4 months ago

The people disliking the language might also be the people that love it. I know I am in that camp.

It is a great language in many ways, that's why its shortcomings are so painful.

wenc4 months ago

As a former Perl hacker who started using Python in 2005, I saw Python ride several waves. (Numerical computation, data science, deep learning)

Perl was the leading tool for scripting and text parsing. Python didn’t really supplant it for a long time — until people started writing more complicated scripts that had to be maintained. Perl reads like line noise after 6 months whereas I can look at Python code from 20 years ago, prettify it with black, and understand it.

Python got picked up by the scientific computing community, which gave it some of its earliest libraries like numpy, f2py, scipy. Some of us who were on MATLAB moved over.

Then data science happened. Pandas built off the scientific computation foundations and eventually libraries like scikit and matplotlib (mimicking matlab’s plotting) came along.

Then tensorflow came along and built on the foundation of numerical libraries. PyTorch followed.

Other systems like Django came and made python popular for building database backed websites.

Suddenly there was momentum and today almost all numerical software have a python API — this includes proprietary stuff like CPLEX and what have you.

Python was the glue language that had the lowest barrier of entry. For instance, Spark was written in Scala and has a performant Scala API but everyone uses PySpark because it’s much more accessible, despite the interop cost.

The counterfactual to all this was Ruby. It had much nicer syntax than Python but when I tried to use it in grad school I was quickly stymied by the lack of numerical libraries. Ruby never found a niche outside of Rails and config management.

Essentially Python — like Nvidia today — bet on linear algebra (and more broadly on data processing) and won.

I get why there’s hate for Python — it’s not a perfect language. Yet those of us pragmatists who use it understand the trade offs. You trade off on the metal performance for programmer performance. You trade off packaging difficulties for something that works. You trade off an imperfect syntax for getting things done.

I could have used Ruby, a much more beautiful language, in grad school and worked around its lacks, but I would not have graduated on time. Python was a pragmatic choice for me and continues to be one for me today (outside of situations requiring raw performance).

commodoreboxer4 months ago

I agree with you, and I'll put it slightly stronger. Ruby is a better language than Python in every way except the very most important two:

- Imports in Ruby seriously suck compared to Python. Everything is required into a global scope, and an ecosystem like Bundler encourages centralizing all imports for your entire codebase into one file.

- Python has docstrings, encouraging in-code documentation.

Add common ecosystem things like the Ruby community encouraging generated methods, magical "do what I mean" parameters, and REPL poke-driven development, and this leads to the effect that Python codebases are almost always well documented and easy to understand. You can tell where every symbol comes from, and you can usually find a documentation entry for every single method. It's not uncommon for a Ruby library, even a popular one, to be documented solely through a scattering of sparsely-explained examples with literally no real API documentation. Inheriting a long-lived Ruby project can be a serious ordeal just to discover where all the code that's running is running, why it's running, where things are preloaded into a builtin class, and with Rails and Railties, a Gem can auto insert behavior and Middleware just by existing, without ever being explicitly mentioned in any code or configs other than the Gemfile. It's an absolute headache.

My dream language would be Ruby with Python-style imports and docstrings.

Myrmornis4 months ago

I think your comment needs to mention that Python has syntax for type annotations and two mature type checkers (mypy and pyright) with more under development. Python is thus very much part of the modern statically typed languages scene (moreso than Go) whereas Ruby isn't at all. Many people wouldn't touch Python today if it weren't for this.

pansa24 months ago

> Python is thus very much part of the modern statically typed languages scene (moreso than Go)

Python’s type system is substantially more complex than Go’s - it’s probably more complete, but given its optional nature, less sound.

In “modern” type systems, is completeness considered more important than soundness? The success of TypeScript suggests it is.

kaba04 months ago

Since basically every single type system has escape hatches (casts), yes, I would say completeness is more important than soundness.

pansa24 months ago

> two mature type checkers

I’ve never quite understood how this works. Surely a type system is absolutely fundamental to a language - how can you have multiple incompatible ones?

Do you need to choose a particular type checker for each project? Are you limited to only using third-party libraries that use the same type checker?

t435624 months ago

I think Python was successful because it started off without a type system and you can still choose not to use it. Duck typing is the big feature really.

It might float your boat to think about types but why would everyone have to want the same thing?

baq4 months ago

Look at JavaScript and typescript - Python’s typing is maybe halfway to that gold standard but there were other typed languages based on js. Python is special in that it provides type hinting syntax which is not used by the interpreter, so writing types doesn’t require the Byzantine build systems of js.
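A tiny sketch of what "not used by the interpreter" means in practice. The `double` function is a made-up example: the annotations are stored on the function object but never enforced at runtime, so catching the second call is left to an external checker such as mypy or pyright.

```python
def double(x: int) -> int:
    """Annotated as int -> int, but nothing enforces that at runtime."""
    return x * 2

print(double(21))              # an int, as the annotation suggests
print(double("ab"))            # a str is accepted too; only a type checker would object
print(double.__annotations__)  # the hints are kept as plain metadata
```

This is also why several checkers can coexist: they all read the same annotation syntax and differ only in how they analyze it.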

RMPR4 months ago

> syntax for type annotations and two mature type checkers (mypy and pyright)

I would throw Pyre in there too

antod4 months ago

Hard agree on the global Ruby import issues. I remember inspecting large custom Rails or Capistrano codebases in pry and having thousands of names imported. That and monkey patching had me wishing for Python with imports only having module scope and being a lot more explicit.

nextos4 months ago

It's a shame Python has a strong anti-FP stance with crippled lambdas. And an OO system that looks like it has been bolted in, compared to Ruby which is essentially a Smalltalk with Perl-like syntax and some Lisp influence.

These two issues would have been quite easy to fix and would have led to a completely different development experience. Python had a good implementation with a nice C FFI (CPython) right from the beginning, whereas Ruby MRI had lots of efficiency issues with long-running computations. IMHO this is one of the reasons why Python won. Building a numerics stack on top of MRI did not look very promising.

pansa24 months ago

> an OO system that looks like it has been bolted in, compared to Ruby

I think the two languages just have different design philosophies. In Python, functions are fundamental and classes are built on top of them. In Ruby, objects are fundamental and functions (i.e. Procs etc) are themselves objects.

You could just as well claim that in Ruby, functions look like they have been bolted in. For example, you can’t call a Proc itself but need to call one of its methods.

pmontra4 months ago

I agree. Python was designed in 1989 and it looks like the OOP we were doing in C (without the ++) back at the time. Objects were a struct with data and function pointers and we were passing them around as pointers. Python has self, explicit in function definition and implicit in function calls, and that self is really like the pointer to the struct. By the way, OO languages from the 90s (e.g. Java and Ruby) were designed to always hide that self, both in method definition and method call. They use it when there is a need to tell the difference between instance attributes and local variables with the same name.

Maybe the explicit self was there to make C programmers feel at home. Functions as fundamental building blocks of the language also make C programmers feel at home. Developers got more familiar with OOP by mid 90s so the new languages could jump from functions-first to objects-first.

nextos4 months ago

I think Python's OO is a bit suboptimal even if the goal was to have method-centric OOP like in C with classes. For example, mechanisms to hide information, a fundamental part of the OO paradigm, are hacky. You need to use name mangling.
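For readers who haven't run into it, a minimal sketch of the name mangling being referred to. The `Account` class is hypothetical; the behaviour shown (a leading double underscore inside a class body is rewritten to `_ClassName__name`) is standard Python.

```python
class Account:
    def __init__(self, balance):
        # Inside the class body, __balance is rewritten to _Account__balance.
        self.__balance = balance

    def balance(self):
        return self.__balance

acct = Account(100)
print(acct.balance())               # access through the method works
print(hasattr(acct, "__balance"))   # the unmangled name does not exist
print(acct._Account__balance)       # but the mangled name is still reachable
```

So the attribute is obscured rather than hidden, which is the "hacky" part of the complaint.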

Same applies to FP, a few things are weird and crippled. IMHO, the net result is that Python code tends to look longer and much more algorithmic than in Ruby, Smalltalk or various Lisps, where the language favors lots of little functions that call each other.

Things are changing a bit, though. For example, pattern matching (PEP 634, first proposed as PEP 622) brings some conciseness. Fixing those other issues would be great.

maple31424 months ago

Aren't Python's functions just objects with a `__call__` method, with syntactic sugar that allows such objects to be called like a function?
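A quick sketch of that equivalence; `greet` and `Greeter` are made-up names. A plain function and an instance of a class defining `__call__` can both be used with call syntax, and a function exposes `__call__` like any other callable object.

```python
def greet(name):
    return f"hello, {name}"

class Greeter:
    """Any object whose class defines __call__ can be called like a function."""

    def __call__(self, name):
        return f"hello, {name}"

print(greet("a"), "|", Greeter()("a"))
print(greet.__call__("a"))   # the function object has __call__ too
print(type(greet).__name__)  # functions are instances of the 'function' type
```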

baq4 months ago

Functions are objects are functions are objects… heard that from a little schemer

kfrzcode4 months ago

Pragmatic use of $LANGUAGE is a telltale sign of the wizened programmer; one who understands the use-case and solution set well enough to know when the tool fits.

I wrote Ruby when I got started because it was the most accessible and the Rails learning content was top notch. Now I use python when I need more than a few `bash` pipes to accomplish anything, but if I were to solve a capital-P Problem, of course the tool often chooses the project after constraints.

o11c4 months ago

There was also the major anti-wave of Python 3. But it has managed to pull through despite ending up with broken strings (RIP all old code that needs to deal with legacy-encoded data), probably because there was no viable replacement.
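For context, a small sketch of what handling legacy-encoded data looks like in Python 3, where `bytes` and `str` are separate types. The sample text is invented; the `surrogateescape` error handler is the standard way to carry undecodable bytes through a `str` and get them back unchanged.

```python
# Bytes as they might arrive from an old Latin-1 file or protocol.
raw = "café".encode("latin-1")

# Decoding with the wrong codec fails loudly instead of producing mojibake.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print("valid UTF-8:", utf8_ok)

# With the right codec the text comes back intact.
text = raw.decode("latin-1")
print(text)

# If the encoding is unknown, surrogateescape keeps the bytes recoverable.
smuggled = raw.decode("utf-8", errors="surrogateescape")
restored = smuggled.encode("utf-8", errors="surrogateescape")
print("round trip preserved bytes:", restored == raw)
```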

wenc4 months ago

Python 3 was a painful episode and I lingered on 2.7 and only ported over around 3.6.

But now 3.11 is fine again. Looking forward to faster releases.

pjmlp4 months ago

As someone that did the Perl to Python transition back in 2003, for UNIX scripting tasks, the way to do OOP with packages and blessed references was clunky, and having to always go back to the manuals for some clever programming tricks from team mates was tiresome, while Python provided something nicer, and I wasn't really into the sed/awk like features in Perl anyway.

However, due to being an interpreted scripting language, I never bothered to use Python for anything beyond OS scripting.

pjmlp4 months ago

Using Python as C and C++ REPL of sorts has been common in academia since it took the scripting crown away from Perl and Tcl, which were used during the late 90's.

For example, see the bioinformatics papers from that period, and the Perl tooling used alongside the research.

Already in 2003 CERN was using Python on some of their build infrastructure (see CMT), Grid Computing scripting efforts, and we had Python trainings available to us.

Now there is a difference between a REPL of sorts, scripting OS tasks, and going full blown applications with a pure interpreter.

Eridrus4 months ago

It didn't happen by total accident, but it didn't happen by design for where we are today either. The original choice to start building data science tooling in Python happened intentionally, but since then path dependence has been a huge thing.

waldrews4 months ago

Looks like all of that would run in a native sandbox environment which in turn is called from the Python running on the JVM. So, maybe it simplifies interop, but whether it's straightforward to get full performance from the native layer (especially GPU/multicore) is an open question.

fniephausop4 months ago

OP here.

More details about this particular release are in the blog post at https://medium.com/graalvm/whats-new-in-graal-languages-24-1...

Happy to answer any additional questions!

nurettin4 months ago

Hi, what's the deployment process like? Is there a program similar to warbler (for jruby) that builds a jar for a python program?

EDIT: I tried the native binary command here on a simple hello world script.

It downloaded some stuff in the background, built the entire python and java and embedded it into a 350 MB ELF binary on linux after 15 minutes of using 24 GB RAM and 100% CPU.

But I'd much prefer a smaller jar file which I can distribute cross-platform.

https://www.graalvm.org/uploads/quick-references/GraalPy_v1/...

fniephausop4 months ago

Thanks for the question, nurettin.

Although GraalPy can create standalone applications [1], you don't have to turn your hello world script into a self-contained binary. You can, of course, create a JAR that depends on GraalPy, or a fat JAR that contains it, and deploy it just like any other Java application.

We are still updating our docs to mention more details on this and publish some guides, apologies for the delay.

[1] https://www.graalvm.org/latest/reference-manual/python/stand...

upghost4 months ago

FWIW we've had full Java/Python integration in Clojure for a while now, courtesy of Chris Nuernberger and libpython-clj: https://github.com/clj-python/libpython-clj

If you're into that sort of thing.

Self-interest disclosure: I'm a major contributor and heavy user.

waldrews4 months ago

What's the GIL/threading story there?

upghost4 months ago

I'm assuming you mean "how well does JVM concurrency play with Python concurrency"? Python concurrency works perfectly well on its own, Java/Clojure concurrency works very well on its own, trying to pass multithreaded information across the JVM boundary to Python while bypassing the GIL will result in a segfault (Edit: but there are "with-gil" wrappers you can use to prevent that, at a slight performance hit). In practice this tends not to be much of a problem as you setup a parallel workload on one side of the boundary or the other and pass information with a threadsafe queue. We do plenty of heavy parallel computations, data science, AI, fintech, etc.

There are certainly some leaky abstractions and there is a general expectation that you understand the quirks of Python and Clojure pretty well, so it's not for everyone. Knowing something about Java would probably help too, but I've been using libpython-clj in production since 2017 and I barely know anything about Java (compared to Python/Clojure).
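The "parallel workload on one side, hand results over through a threadsafe queue" pattern can be sketched in plain Python as below. This is a stand-in only: it does not use libpython-clj, both sides here are Python threads, and `producer`, `consumer`, and `run` are hypothetical names.

```python
import queue
import threading

def producer(q, n):
    """Stand-in for the side doing the heavy computation."""
    for i in range(n):
        q.put(i * i)
    q.put(None)  # sentinel: no more work

def consumer(q, results):
    """Stand-in for the other side, which only ever touches the queue."""
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item)

def run(n=5):
    q = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=producer, args=(q, n)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(run())
```

Keeping the queue as the only shared object means neither side needs to reason about the other's locking rules.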

malux854 months ago

This is pretty interesting, what's the benefit over using python so directly with java? I mean, is the overhead of having these as seperate services / processes too much? I'm not trying to provoke I'm genuinely curious about the use case.

Also, what's the dev workflow like? When I'm coding Python I basically live inside the debugger (a.k.a. the Carmack method). Do you use an IDE that understands both Java and Python? What's the debugging experience like? Can you set a breakpoint and then evaluate Python code and expressions inside the debugger, like you can if it was solely a Python project using VSCode and the Python debugger?

upghost4 months ago

Oh sorry, there are actually huge performance benefits over a services-based approach, because you're using the same memory space instead of serializing. This is particularly enormous for the ML and data science space because of the work Chris did on hyper-efficient zero-copy mapping of numpy arrays to tech.ml.dataset tensors.

Not even GraalVM has that! Not yet, anyway.

So there are a lot of easy performance synergies over microservices, but I'm the kind of dev who tends to prioritize fun over performance as long as it's "performant enough". Fortunately, Chris (author of libpython-clj) is an ex-Nvidia, performance-obsessed dev, so the performance there is on point.
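To make the "same memory space instead of serializing" distinction concrete, here is a small sketch using only the Python standard library. It is not how libpython-clj or tech.ml.dataset do their mapping; `memoryview` and `pickle` are stand-ins for "zero-copy view" and "serialize across a service boundary", and the function names are invented for the example.

```python
import pickle
from array import array


def share_zero_copy(buf):
    """Return a view onto the same underlying memory; no bytes are copied."""
    return memoryview(buf)


def share_by_serializing(buf):
    """Round-trip through bytes, as a separate service or process would have to."""
    return pickle.loads(pickle.dumps(buf))


data = array("d", [1.0, 2.0, 3.0])
view = share_zero_copy(data)
copy = share_by_serializing(data)

# A write through the original is visible through the view, because both
# refer to the same buffer. The deserialized copy is an independent object.
data[0] = 99.0
view_sees_write = view[0] == 99.0
copy_sees_write = copy[0] == 99.0
```

With the view, the cost of sharing does not grow with the size of the data; with the round-trip, every hand-off pays for encoding, copying, and decoding.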

upghost4 months ago

That's a really interesting line of questioning! We have a mode called "embedded mode" where you run the Python application first, THEN initialize the JVM and Clojure via the Python "javabridge" package. From there, you can start your Clojure REPL and get either Clojure's IDE-integrated REPL experience or the Python debugger, depending on how you set it up. This also allows you to run maximally complex Python applications and is the recommended approach for training ML robustly.

I also tend to live inside the debugger for some things but for other things I really enjoy the Clojure/lisp style "in editor evaluation" (where the result appears right after your cursor when you evaluate the S-expression).

The use-cases question is a good one. Python has some pretty good libraries. For one project, we have a (Clojure) ring server and GCP cloud resources. Using the Python GCP secret manager to access protected cloud resources allows me to have the same code in dev and prod with minimal configuration.

Also, sometimes it's just political. Maybe your workplace is a Clojure/Java-only shop. In that case, sometimes you can make the case that Python is "just a library" and get some cool toys. In other circumstances, where it's Python only, you can at least dev using your lisp REPL.

So if that kind of thing sounds fun to you (and you like emacs) you'll like this. If that sounds like hell to you, then it is!! I really tried hard to optimize around "fun" for the API, but it's also really performant, and great fun for hacking.

In particular I really love doing silly stuff with Python LLMs in the Clojure REPL.

So, tl;dr, I'd say it is really great if you are a certain kind of hacker who wants all the most fun toys and as an added bonus it also works in production.

malux854 months ago

Thank you for the very thorough answer :)

wenc4 months ago

DuckDB is not currently a supported package, but Pandas and matplotlib are, which is good. If DuckDB and Polars were supported and if they ran well, I suspect many data jobs could benefit.

rsyring4 months ago

Why would they benefit? When DuckDB/Polars are being used correctly, all the work is happening in the native stack. It should already be very fast compared to the Python runtime.

I recently moved a large ETL process that was mostly Python runtime processing to pyarrow/Polars and wrote all the ETL logic in SQL. I've seen processes that used to take a week to run drop to about an hour (no exaggeration).
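As a rough illustration of moving logic out of the Python runtime and into SQL, the sketch below computes the same aggregate twice: once in a Python loop and once inside a database engine. It uses the standard library's `sqlite3` purely as a stand-in for DuckDB or Polars, and the `sales` table and function names are hypothetical.

```python
import sqlite3


def totals_in_python(rows):
    """Aggregate in the interpreter: every row passes through Python bytecode."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals


def totals_in_sql(rows):
    """Aggregate in the engine: Python only loads the data and reads the result."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        con.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cur = con.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
        return dict(cur.fetchall())
    finally:
        con.close()
```

On a toy input the two are indistinguishable; the difference described above only shows up when the per-row work dominates and the engine can do it natively.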

wenc4 months ago

They wouldn’t benefit performance-wise because, as you say, they are already blazing fast as is. And I know what you mean: I rewrote a pure (granted, old pre-2.0) pandas transformation in DuckDB and compute time dropped from nearly an hour to single-digit minutes.

But having these in Graal would allow more types of applications to be deployed in JVM stacks. As sibling comments note, many data science models are in python but production stacks are in Java.

rsyring4 months ago

> But having these in Graal would allow more types of applications to be deployed in JVM stack...

Ah...makes sense now. I was thinking along the lines of someone switching to the JVM for better performance, but being held back by the absence of those libraries.

sevensor4 months ago

Took a little digging to find that it targets 3.11. Didn’t see anything about a GIL. If you’re a Python person, don’t click the quick start link unless you want to look at some xml.

pjmlp4 months ago

JVM or CLR variants of Python naturally don't have any GIL; there is no such thing on those platforms.

YAML and JSON have both tried to replicate the XML tooling experience, only worse.

Schemas, comments, parsing and schema conversions tools.

lopuhin4 months ago

I think GraalPython does have a GIL, see https://github.com/oracle/graalpython/blob/master/docs/contr... - and if by "there is no such thing on those platforms" you mean JVM/CLR not having a GIL, C also does not have a GIL but CPython does.

pjmlp4 months ago

My mistake, as I assumed they made the same decision as Jython and IronPython.

https://jython.readthedocs.io/en/latest/Concurrency/#no-glob...

https://wiki.python.org/moin/IronPython

The difference between JVM, CLR and C in regards to parallel and concurrent code is that the first two are built for those kinds of workloads and have a proper memory model, hence not needing a GIL.

commodoreboxer4 months ago

I think they would have to here, to support native modules. Jython (and I believe IronPython, but don't quote me) does not support native CPython modules. CPython modules explicitly control the GIL, so if they are supported (as they are here), you can't really leave the GIL out without exposing potential thread safety issues.

westurner4 months ago

"PEP 703 – Making the Global Interpreter Lock Optional in CPython" (2023) https://peps.python.org/pep-0703/

CPython built with --disable-gil does not have a GIL (as long as PYTHONGIL=0 and all loaded C extensions are built for --disable-gil mode) https://peps.python.org/pep-0703/#py-mod-gil-slot

"Intent to approve PEP 703: making the GIL optional" (2023) https://news.ycombinator.com/item?id=36913328#36917709 https://news.ycombinator.com/item?id=36913328#36921625

kaashif4 months ago

This is pretty beside the point. The point is that X not having a GIL doesn't inherently mean Python on X also doesn't have a GIL.

westurner4 months ago

CPython does not have a GIL (Global Interpreter Lock) GC (garbage collection) phase when built with --disable-gil. GraalVM does have a GIL, like CPython built without --disable-gil.

How CPython accomplished nogil in their - the original and reference - fork is described in the topical linked PEP 703.

kaashif4 months ago

Yes, I know. What I'm saying is that:

It's possible to have a language that doesn't have a GIL, which you implement Python in, but that Python implementation then has a GIL.

The point being that you can't say things like: Jython is written in Java so it doesn't have a GIL. CPython is written in C so doesn't have a GIL. And so on.

If this isn't clear, I apologize.
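One way to avoid inferring GIL behaviour from the host language is to ask the running interpreter. The sketch below assumes `sys._is_gil_enabled()`, which CPython added in 3.13; on interpreters that lack it (older CPython, and possibly alternative implementations, which are not checked here) it simply reports "unknown". The helper names are invented for the example.

```python
import platform
import sys


def gil_status():
    """Report the GIL state if the interpreter exposes it, else 'unknown'."""
    check = getattr(sys, "_is_gil_enabled", None)
    if check is None:
        # No introspection hook available; make no claim either way.
        return "unknown"
    return "enabled" if check() else "disabled"


def describe_runtime():
    """One-line summary: implementation name, version, and GIL state."""
    impl = platform.python_implementation()
    version = platform.python_version()
    return f"{impl} {version}: GIL {gil_status()}"
```

Calling `describe_runtime()` under different interpreters gives a per-implementation answer, which is the distinction being drawn in this sub-thread.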

westurner4 months ago

Oh okay. Yeah I would say that the Java GC and the ported CPython GIL are probably limits to the performance of any Python in Java implementation.

But are there even nogil builds of CPython C extensions on PyPI yet, anyway?

Re: Ghidraal and various methods of Python in Java: https://news.ycombinator.com/item?id=36454485

jitl4 months ago

Happily, you can ignore the Maven XML and use Gradle instead, it's the next codeblock on the page, after "or":

    implementation("org.graalvm.polyglot:polyglot:24.1.0")
    implementation("org.graalvm.polyglot:python:24.1.0")

vips7L4 months ago

Gradle files are less verbose than the equivalent Maven pom.xml but Gradle tends to have other issues like: complex builds that are hard to maintain, not running on the latest JVM version without some wait time, and constantly breaking because Gradle makes breaking changes every release. I'm hoping the declarative Gradle experiment [0] helps with this.

Additionally if XML isn't your thing Maven is making a push for other formats in Maven 4 like HOCON [1].

[0] https://blog.gradle.org/declarative-gradle-first-eap [1] https://github.com/apache/maven-hocon-extension

foobazgt4 months ago

I mean, if you're trying to embed one language in another, please don't be surprised when the quickstart guide has a couple of examples containing a few lines of code written for the embedding language and its package manager(s).

abernard14 months ago

An honorific. So much of this dynamic language performance improvement on the Graal JVM was proven out by Chris Seaton.

May he rest in peace.

nkzd4 months ago

What is the use-case for GraalPy? To be honest, I don't understand why anyone would want to use it.

andreldm4 months ago

I worked at a company where data scientists wrote Python code using pandas and we had to port it to Java and a library called keanu, which was very useful but soon became unmaintained.

Of course this was very time consuming and unrewarding, all because only java applications could be deployed to production due to a stupid top-down decision.

This GraalPy sounds like something I wish existed back then.

hobofan4 months ago

jep[0] has existed for a while now, and does what GraalPy is doing quite well.

I'm using it for similar purposes as you stated and for that it works quite well. A research group I am collaborating with does a lot of their work in one Java application (ImageJ for microscopy), so by integrating my Python processing code into that application, it finds its way a lot quicker into the daily workflows of everyone in that group.

Most recently I've also extended the jep setup to include optional Python version bootstrapping via uv[1], so that I can be sure that the plugins I'm writing have the correct Python version available, without people having to install that manually on the machine.

[0]: https://github.com/ninia/jep

[1]: https://github.com/astral-sh/uv

pvorb4 months ago

Did you look into Jython back then?

toyg4 months ago

Jython has historically lagged hard, often falling behind for very extended periods. For a time their releases basically just stopped, which led to them missing support for pretty much anything between 2.7 and 3.6 (iirc). I know the project basically rebooted at some point, but I've since lost interest.

RMPR4 months ago

Not to mention the biggest drawback imho. Those alternative implementations don't support C extensions.

jsight4 months ago

Jython was dead for a long time. It might be back a little now, but there is still no Python 3 support.

GraalPy is much more active and more compatible.

andreldm4 months ago

Not me, someone else in the company did, I don’t remember why it was dismissed.

chc44 months ago

Ghidra embeds Python scripting via Jython, which is stuck on Python 2. Switching to GraalPy would allow Python 3 scripting.

Any other Java programs that want a scripting engine could use it as well.

kaba04 months ago

Besides all the nice answers given by others, a big one was not mentioned: performance!

Graal can do pretty advanced JIT-compilation for any Graal language, plus you can mix-and-match languages (with a big chunk of their ecosystems) and it will actually compile across language boundaries. And we haven’t even mentioned Java’s state of the art GCs that can run circles around any tracing GC, let alone the very low throughput reference counting.

ackfoobar4 months ago

I guess for pure python applications, they'd rather throw more hardware at the problem than messing with the JVM.

kaba04 months ago

For serial workloads it’s very very hard to scale by hardware, though. CPUs aren’t getting 2x faster as they used to.

Also, what is “messing with the JVM”? That’s like one of the most battle tested technologies out there, right next to the Linux kernel.

ackfoobar4 months ago

Don't get me wrong, I love the JVM.

The unfortunately common irrational aversion to JVM aside, there's also the fear of "using it wrong".

theflyinghorse4 months ago

Picture working for a big, non-tech corporation. Your BU only does Java because it has always been thus and Jeff the SVP is a law grad and doesn't want anything to change because of perceived risk. GraalVM allows smart people who have to work within such limitations to still write (mostly) the software they want while still vaguely relating it to Java for decision makers.

nunobrito4 months ago

Those "smart people" write blackboxes in esoteric languages that only the same person maintains.

Everyone else has to write wrappers to interact with that blackbox. God forbid someone daring to even change the code, because it basically doesn't even need/use junit tests. Eventually the smart person gets bored and moves to something else, that tool then gets rewritten to Java in two days by someone else.

End of story.

actionfromafar4 months ago

Not so vaguely, either. The dev story is not Java but the deploy story is.

abirch4 months ago

Minecraft Mods can only be written in Java and I want my kid to learn python.

Jython is still 2.x and it'd be nice to let my kid write a minecraft mod in python. Not a business use case but a use case.

smj-edison4 months ago

When I was learning programming, my coding class used a Bukkit plugin that connected to Python. I can't remember what it was called, but that was for Minecraft 1.7.10.

Not sure if you were wanting Python specifically, but KubeJS lets you use JavaScript for mods. I think there's also a clojure integration.

abirch4 months ago

Thank you. My 3rd grader knows basic python so I'd prefer to stick with that or Scratch

pvorb4 months ago

Maybe this would be an interesting alternative runtime environment for PySpark? I think currently PySpark runs in Python and somehow interacts with a JVM and relies on copying data from one to the other.

xyproto4 months ago

Data scientists trapped in bureaucracy?

the_arun4 months ago

I am assuming - With this, JVMs needing integration with LLMs can embed LLMs in JVM instead of making outbound API calls. If my assumption is right - wouldn't this improve performance of consumer applications?

pjmlp4 months ago

Thankfully some LLMs also have Java bindings to the same native libraries used by Python.

theanonymousone4 months ago

Does it have to be run in a GraalVM, or is any JVM implementation fine?

Okx4 months ago

> You can use GraalPy with GraalVM JDK, Oracle JDK, or OpenJDK

https://www.graalvm.org/latest/reference-manual/python/

theanonymousone4 months ago

Thanks. I actually managed to run the quick example with Temurin Java 22. Maybe that is what they mean by "OpenJDK": java.vm.name=OpenJDK 64-Bit Server VM, java.vendor.version=Temurin-22.0.2+9

ackfoobar4 months ago

theanonymousone4 months ago

Update. I actually managed to run the quick example with Temurin Java 22: java.vm.name=OpenJDK 64-Bit Server VM, java.vendor.version=Temurin-22.0.2+9

mike_hearn4 months ago

It won't JIT compile on anything other than GraalVM however. So it'll run, but slowly.

ackfoobar4 months ago

The documentation says "Optimized if enabled via experimental VM option". (I linked in another thread.)

jryan494 months ago

Graal lets you compile native binaries

ackfoobar4 months ago

Graal is many things (a marketing nightmare). The guest language part is orthogonal to the native packager AFAIK.

w10-14 months ago

Yes, but I was under the impression that graal-level inter-op was limited to packages the graal toolchain could compile.

Thus, while Swift and Graal both depend on LLVM, they use different variants and there's no real way to make inter-op between Swift and Graal (even using the LLVM IR which Graal is said to be able to consume).

e.g., I believe this announcement represents the work to compile a python (3.11) and some proof-of-concept python packages using graal toolchain, to spur other packages to support the same.

So I'd really love to be wrong, but I believe building under the graal llvm is the common factor.

kaba04 months ago

I don’t really see how swift comes into the picture, besides SuLong being a thing (running LLVM bitcode). Native binary was meant as a compile target in the previous comment, I believe, not as an input. Graal can do both, but as a target it has no dependency on LLVM.

So yeah, graalvm should be able to produce a native binary for python code (though depending on the specifics it might actually be more like a native binary interpreter running python scripts, it can’t optimize in every circumstance but I’m hazy on the details).

cout4 months ago

Could this directly invoke Java (or Scala) functions without using a bridge? If so this would be great for programs that use spark -- UDFs would become performant enough to consider using on medium-to-large dataframes.

mkoubaa4 months ago

HPy can eventually be used to support CPython extension modules in GraalPy

ajdhGfa4 months ago

And they will run how much slower or have strange bugs?

nprateem4 months ago

An Internet point for the first working demo with django + postgres.

pantulis4 months ago

GraalVM is fascinating. Honest question: what are Oracle's plans for it? How does it serve them?

mk894 months ago

There is already an EE for it, so I guess they provide basic functionalities for free, and if you need additional features you have to pay?

mike_hearn4 months ago

EE is mostly free to use actually (check the licensing FAQ for exactly when). Graal features get integrated into Oracle products and make them better.

ajdhGfa4 months ago

I'm very skeptical about production use, but the thought of Oracle taking over Python is amusing, since the Python community is already run like Oracle in a top down military manner. It can only get better!

calrizien4 months ago

Is there a way to embed Python 3 into Swift like this?

w10-14 months ago

I haven't seen embedding using graal/vm, or inter-op using the native JVM FFI.

There is (active, 2K stars) https://github.com/pvieito/PythonKit and I've heard of people being able to deploy apps with python on the app store. YMMV.

froh4 months ago

what's the advantage of this over JPype?

mdaniel4 months ago

That it goes in the opposite direction of your cited project (run modern-ish python from within the JVM), and almost certainly has a much, much better JIT story than yours

iLemming4 months ago

What does that mean for Clojure?

positr0n4 months ago

Same thing as Java. You could use this to run python in your clojure JVM process.

masklinn4 months ago

Why would it mean anything for clojure?

2OEH8eoCRo04 months ago

[flagged]

pjmlp4 months ago

In places where it actually has a JIT.

mkoubaa4 months ago

The one we want to live in while you fling poop
