
Deepsilicon

Deepsilicon develops software and hardware for training and running inference on ternary transformer models, aiming to sharply reduce the compute and energy required to run large transformer-based models. Its approach compresses weights to ternary values (three levels, stored in two bits each), achieving nearly 8x compression and enabling efficient inference on edge and cloud devices. The company provides a software framework compatible with PyTorch and HuggingFace, with custom kernels optimized for CPUs and NVIDIA GPUs, and plans to develop custom silicon (ASICs) for further acceleration. Targeting edge-computing markets such as robotics on NVIDIA Jetson devices, Deepsilicon emphasizes seamless integration with existing ML workflows via a "single line of code" conversion and supports quantization-aware training to minimize accuracy loss.
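Deepsilicon's own conversion API is not documented on this page, so the snippet below is only a minimal sketch of how ternary quantization-aware training is commonly done in PyTorch: weights are ternarized with a per-tensor absolute-mean scale in the forward pass, and a straight-through estimator keeps gradients flowing into the full-precision copies. The names `ternarize` and `TernaryLinear` are illustrative, not part of Deepsilicon's framework.

```python
# Minimal ternary QAT sketch (illustrative; not Deepsilicon's actual API).
import torch
import torch.nn as nn
import torch.nn.functional as F


def ternarize(w: torch.Tensor) -> torch.Tensor:
    """Quantize weights to {-1, 0, +1} times a per-tensor absmean scale."""
    scale = w.abs().mean().clamp(min=1e-8)
    return torch.round((w / scale).clamp(-1.0, 1.0)) * scale


class TernaryLinear(nn.Linear):
    """Drop-in replacement for nn.Linear that trains with ternary weights."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_q = ternarize(self.weight)
        # Straight-through estimator: forward with ternary weights,
        # backpropagate into the full-precision master weights.
        w = self.weight + (w_q - self.weight).detach()
        return F.linear(x, w, self.bias)


# Example: swap a layer in an existing model, then fine-tune as usual.
layer = TernaryLinear(512, 512)
out = layer(torch.randn(8, 512))
```

Swapping layers this way mirrors the "single line of code" integration described above, though the actual framework presumably handles the conversion across a whole PyTorch/HuggingFace model automatically.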

platform:web platform:linux platform:windows platform:arm pricing:free pricing:paid form:library form:api form:software-development-kit feature:quantization-aware-training feature:model-compression feature:hardware-acceleration feature:custom-silicon feature:low-power feature:edge-computing feature:gpu-acceleration feature:cpu-acceleration feature:onnx-integration feature:torch-compatible feature:open-source target:developers target:researchers target:ml-engineers target:enterprises use-case:model-inference use-case:edge-ai use-case:robotics use-case:ai-acceleration use-case:energy-efficient-computing

Features

Quantization Aware Training
Model Compression
Hardware Acceleration
Custom Silicon
Low Power
Edge Computing
GPU Acceleration
CPU Acceleration
ONNX Integration
Torch Compatible
Open Source
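
To make the model-compression feature concrete, here is a back-of-the-envelope check of the roughly 8x figure from the description: a ternary weight needs only 2 bits, so four weights pack into one byte, versus 16 bits per weight in FP16. The packing layout below is purely illustrative and says nothing about Deepsilicon's actual on-device format.

```python
# Illustrative 2-bit packing of ternary weights (not Deepsilicon's format).
import numpy as np

weights = np.random.choice([-1, 0, 1], size=1024).astype(np.int8)

# Encode {-1, 0, +1} as the 2-bit codes {0, 1, 2}, four codes per byte.
codes = (weights + 1).astype(np.uint8)
packed = (codes[0::4]
          | (codes[1::4] << 2)
          | (codes[2::4] << 4)
          | (codes[3::4] << 6)).astype(np.uint8)

fp16_bytes = weights.size * 2      # 16 bits per weight
ternary_bytes = packed.size        # 2 bits per weight
print(fp16_bytes / ternary_bytes)  # -> 8.0
```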

Testimonials

No testimonials available for this tool yet.

Basic Info
  • Category: AI & Machine Learning
Availability & Pricing
  • Code Access: Open Source
  • Pricing Model: Free, Paid
  • Details: Paid