Hacker News

Compactifai
CompactifAI Inference API

Hi HN,

We’ve been working on model compression and have deployed both our compressed models and the original versions on our AWS cluster, accessible through an inference API. We’d love feedback from developers on the integration experience and overall behavior. If you’d like to try it, email [email protected] and we’ll send an API key and brief docs. We’re granting access to a limited number of users for a 3-month period with no usage caps, so you can run real workloads during that window.

Model benchmarks are available here: https://artificialanalysis.ai/providers/compactifai

