22 changes: 22 additions & 0 deletions .github/instructions/cython.instructions.md
@@ -0,0 +1,22 @@
---
applyTo:
- "dpctl/**/*.pyx"
- "dpctl/**/*.pxd"
- "dpctl/**/*.pxi"
---

# Cython Instructions

See `dpctl/AGENTS.md` for full conventions.

## Required Directives (after license)
```cython
# distutils: language = c++
# cython: language_level=3
# cython: linetrace=True
```

## Key Rules
- `cimport` for C-level, `import` for Python-level
- Store C refs as `_*_ref`, clean up in `__dealloc__` with NULL check
- Use `with nogil:` for blocking C operations
26 changes: 26 additions & 0 deletions .github/instructions/dpctl.instructions.md
@@ -0,0 +1,26 @@
---
applyTo:
- "**/*.py"
- "**/*.pyx"
- "**/*.pxd"
- "**/*.cpp"
- "**/*.hpp"
- "**/*.h"
---

# DPCTL General Instructions

See `AGENTS.md` at repository root for project overview and architecture.
Each major directory has its own `AGENTS.md` with specific conventions.

## Key References

- **Code style:** `.pre-commit-config.yaml`, `.clang-format`, `.flake8`
- **License:** Apache 2.0 with Intel copyright - match existing file headers

## Critical Rules

1. **Device compatibility:** Not all devices support fp64/fp16 - never assume availability
2. **Queue consistency:** Arrays in same operation must share compatible queues
3. **Resource cleanup:** Clean up C resources in `__dealloc__` with NULL check
4. **NULL checks:** Always check C API returns before use
24 changes: 24 additions & 0 deletions .github/instructions/elementwise.instructions.md
@@ -0,0 +1,24 @@
---
applyTo:
- "dpctl/tensor/libtensor/include/kernels/elementwise_functions/**"
- "dpctl/tensor/libtensor/source/elementwise_functions/**"
- "dpctl/tensor/_elementwise_*.py"
- "dpctl/tests/elementwise/**"
---

# Elementwise Operations Instructions

Full stack: C++ kernel → pybind11 → Python wrapper → tests

## References
- C++ kernels: `dpctl/tensor/libtensor/AGENTS.md`
- Python wrappers: `dpctl/tensor/AGENTS.md`
- Tests: `dpctl/tests/AGENTS.md`

## Adding New Operation
1. `libtensor/include/kernels/elementwise_functions/op.hpp` - functor
2. `libtensor/source/elementwise_functions/op.cpp` - dispatch tables
3. Register in `tensor_elementwise.cpp`
4. `_elementwise_funcs.py` - Python wrapper
5. Export in `__init__.py`
6. `tests/elementwise/test_op.py` - full dtype/usm coverage
23 changes: 23 additions & 0 deletions .github/instructions/libsyclinterface.instructions.md
@@ -0,0 +1,23 @@
---
applyTo:
- "libsyclinterface/**/*.h"
- "libsyclinterface/**/*.hpp"
- "libsyclinterface/**/*.cpp"
---

# C API Instructions

See `libsyclinterface/AGENTS.md` for conventions.

## Naming
`DPCTL<ClassName>_<MethodName>` (e.g., `DPCTLDevice_Create`)

## Ownership annotations (see `include/syclinterface/Support/MemOwnershipAttrs.h`)
- `__dpctl_give` - caller must free
- `__dpctl_take` - function takes ownership
- `__dpctl_keep` - function only observes

## Key Rules
- Annotate all parameters and returns
- Return NULL on failure
- Use `DPCTL_API` for exports
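
The naming rule above can be checked mechanically. The following is a minimal sketch only: the regular expression and the helper name `follows_dpctl_naming` are assumptions made for this illustration and are not part of the repository's tooling.

```python
import re

# Assumed shape of `DPCTL<ClassName>_<MethodName>`: both parts start with
# an uppercase letter and contain only letters and digits.
_DPCTL_NAME = re.compile(r"DPCTL[A-Z][A-Za-z0-9]*_[A-Z][A-Za-z0-9]*")


def follows_dpctl_naming(symbol: str) -> bool:
    """Return True if `symbol` matches the assumed C API naming pattern."""
    return _DPCTL_NAME.fullmatch(symbol) is not None


print(follows_dpctl_naming("DPCTLDevice_Create"))   # True
print(follows_dpctl_naming("dpctl_device_create"))  # False
```

A check like this could run in a lint step, though existing symbols that predate the convention may need an allow-list.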
16 changes: 16 additions & 0 deletions .github/instructions/libtensor-cpp.instructions.md
@@ -0,0 +1,16 @@
---
applyTo:
- "dpctl/tensor/libtensor/**/*.hpp"
- "dpctl/tensor/libtensor/**/*.cpp"
---

# C++ SYCL Kernel Instructions

See `dpctl/tensor/libtensor/AGENTS.md` for patterns and directory structure.

## Key Rules
- Kernel class names must be globally unique
- Use `if constexpr` for compile-time type branching
- Complex types don't support vectorization
- Return `nullptr` from factory for unsupported types
- Check `include/kernels/elementwise_functions/common.hpp` for base patterns
19 changes: 19 additions & 0 deletions .github/instructions/memory.instructions.md
@@ -0,0 +1,19 @@
---
applyTo:
- "dpctl/memory/**"
- "**/test_sycl_usm*.py"
---

# USM Memory Instructions

See `dpctl/memory/AGENTS.md` for details.

## USM Types
- `MemoryUSMDevice` - device-only (fastest)
- `MemoryUSMShared` - host and device accessible
- `MemoryUSMHost` - host memory, device accessible

## Lifetime Rules
1. Memory is queue-bound
2. Keep memory alive until operations complete
3. Views extend base memory lifetime
19 changes: 19 additions & 0 deletions .github/instructions/tensor-python.instructions.md
@@ -0,0 +1,19 @@
---
applyTo:
- "dpctl/tensor/*.py"
- "dpctl/tensor/**/*.py"
---

# Tensor Python Instructions

See `dpctl/tensor/AGENTS.md` for patterns.

## Queue validation (required)
```python
exec_q = dpctl.utils.get_execution_queue([x.sycl_queue, y.sycl_queue])
if exec_q is None:
    raise ExecutionPlacementError("...")
```
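
The idea behind this check can be sketched without any dependencies, assuming the rule is "all inputs must be associated with the same queue". `pick_execution_queue`, `checked_queue`, and the local `ExecutionPlacementError` class are stand-ins written for this illustration; real code should call `dpctl.utils.get_execution_queue` and raise the library's own exception as shown above.

```python
class ExecutionPlacementError(Exception):
    """Local stand-in for the library exception, for illustration only."""


def pick_execution_queue(queues):
    """Return the shared queue if all entries compare equal, else None."""
    queues = list(queues)
    if not queues:
        return None
    first = queues[0]
    return first if all(q == first for q in queues[1:]) else None


def checked_queue(*queues):
    """Raise instead of silently computing across incompatible queues."""
    exec_q = pick_execution_queue(queues)
    if exec_q is None:
        raise ExecutionPlacementError(
            "Execution placement cannot be inferred from the input arguments."
        )
    return exec_q


# Plain strings stand in for queue objects here.
print(checked_queue("q0", "q0"))  # q0
```

Failing early this way keeps the error at the call site rather than deep inside a kernel submission.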

## Adding operations
See checklist in `dpctl/tensor/AGENTS.md`.
23 changes: 23 additions & 0 deletions .github/instructions/testing.instructions.md
@@ -0,0 +1,23 @@
---
applyTo:
- "dpctl/tests/**/*.py"
- "**/test_*.py"
---

# Testing Instructions

See `dpctl/tests/AGENTS.md` for patterns.

## Essential helpers (from `helper/_helper.py`)
```python
get_queue_or_skip() # Create queue or skip
skip_if_dtype_not_supported() # Skip if device lacks dtype
```

## Dtype/USM lists
Import from `elementwise/utils.py` - do not hardcode.

## Coverage
- All dtypes from `_all_dtypes`
- All USM types: device, shared, host
- Edge cases: empty, scalar, broadcast
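
One way to picture the coverage requirement is as a cross product of dtypes, USM types, and edge-case shapes. The sketch below uses shortened placeholder lists purely for illustration; in the repository the dtype and USM lists should be imported from `elementwise/utils.py` rather than written out, and the names `sample_dtypes` and `edge_shapes` are hypothetical.

```python
import itertools

# Placeholder values for illustration only; real tests import these lists.
sample_dtypes = ["i4", "f4", "c8"]
usm_types = ["device", "shared", "host"]
edge_shapes = {"empty": (0,), "scalar": (), "broadcast": (3, 1)}

# Every (dtype, usm_type, edge case) combination a test would visit.
cases = list(itertools.product(sample_dtypes, usm_types, edge_shapes))
print(len(cases))  # 27
```

In an actual test module the same product would typically be expressed through stacked parametrization instead of an explicit list.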
59 changes: 59 additions & 0 deletions AGENTS.md
@@ -0,0 +1,59 @@
# AGENTS.md - AI Agent Guide for DPCTL

## Overview

**DPCTL** (Data Parallel Control) is a Python SYCL binding library for heterogeneous computing. It provides Python wrappers for SYCL runtime objects and implements the Python Array API standard for tensor operations.

- **License:** Apache 2.0 (see `LICENSE`)
- **Copyright:** Intel Corporation

## Architecture

```
Python API → Cython Bindings → C API             → SYCL Runtime
dpctl/       _sycl_*.pyx       libsyclinterface/

dpctl.tensor → pybind11 → C++ Kernels (libtensor/) → SYCL Runtime
```

## Directory Guide

| Directory | AGENTS.md | Purpose |
|-----------|-----------|---------|
| `dpctl/` | [dpctl/AGENTS.md](dpctl/AGENTS.md) | Core SYCL bindings (Device, Queue, Context) |
| `dpctl/tensor/` | [dpctl/tensor/AGENTS.md](dpctl/tensor/AGENTS.md) | Array API tensor operations |
| `dpctl/tensor/libtensor/` | [dpctl/tensor/libtensor/AGENTS.md](dpctl/tensor/libtensor/AGENTS.md) | C++ SYCL kernels |
| `dpctl/memory/` | [dpctl/memory/AGENTS.md](dpctl/memory/AGENTS.md) | USM memory management |
| `dpctl/program/` | [dpctl/program/AGENTS.md](dpctl/program/AGENTS.md) | SYCL kernel compilation |
| `dpctl/utils/` | [dpctl/utils/AGENTS.md](dpctl/utils/AGENTS.md) | Utility functions |
| `dpctl/tests/` | [dpctl/tests/AGENTS.md](dpctl/tests/AGENTS.md) | Test suite |
| `libsyclinterface/` | [libsyclinterface/AGENTS.md](libsyclinterface/AGENTS.md) | C API layer |

## Code Style

Configuration files (do not hardcode versions - check these files):
- **Python/Cython:** `.pre-commit-config.yaml`
- **C/C++:** `.clang-format`
- **Linting:** `.flake8`

## License Header

All source files require Apache 2.0 header with Intel copyright. Reference existing files for exact format.

## Quick Reference

```python
import dpctl
import dpctl.tensor as dpt

q = dpctl.SyclQueue("gpu") # Create queue
x = dpt.ones((100, 100), dtype="f4", sycl_queue=q) # Create array
np_array = dpt.asnumpy(x) # Transfer to host
```

## Key Concepts

- **Queue:** Execution context binding device + context
- **USM:** Unified Shared Memory (device/shared/host types)
- **Filter string:** Device selector syntax `"backend:device_type:num"`
- **Array API:** Python standard for array operations (https://data-apis.org/array-api/)
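
To make the filter-string syntax concrete, here is a small parsing sketch. It is not how dpctl parses selectors; the function name `parse_filter_string` and the two vocabularies are assumptions for illustration, and the sketch accepts partial strings such as `"gpu"` because the quick reference above uses one.

```python
BACKENDS = {"opencl", "level_zero", "cuda", "hip"}  # assumed, not exhaustive
DEVICE_TYPES = {"cpu", "gpu", "accelerator"}        # assumed, not exhaustive


def parse_filter_string(selector: str) -> dict:
    """Split a filter string into backend, device_type and num fields."""
    parsed = {"backend": None, "device_type": None, "num": None}
    for token in selector.split(":"):
        if token in BACKENDS and parsed["backend"] is None:
            parsed["backend"] = token
        elif token in DEVICE_TYPES and parsed["device_type"] is None:
            parsed["device_type"] = token
        elif token.isdigit() and parsed["num"] is None:
            parsed["num"] = int(token)
        else:
            raise ValueError(f"unrecognized filter component: {token!r}")
    return parsed


print(parse_filter_string("level_zero:gpu:0"))
print(parse_filter_string("gpu"))
```

The sketch does not enforce component order; a stricter version could reject strings whose fields appear out of sequence.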
62 changes: 62 additions & 0 deletions dpctl/AGENTS.md
@@ -0,0 +1,62 @@
# dpctl/ - Core SYCL Bindings

## Purpose

Python/Cython wrappers for SYCL runtime objects: Device, Queue, Context, Event, Platform.

## Key Files

| File | Purpose |
|------|---------|
| `_sycl_device.pyx` | `SyclDevice` wrapping `sycl::device` |
| `_sycl_queue.pyx` | `SyclQueue` wrapping `sycl::queue` |
| `_sycl_context.pyx` | `SyclContext` wrapping `sycl::context` |
| `_sycl_event.pyx` | `SyclEvent` wrapping `sycl::event` |
| `_sycl_platform.pyx` | `SyclPlatform` wrapping `sycl::platform` |
| `_sycl_device_factory.pyx` | Device enumeration and selection |
| `_sycl_queue_manager.pyx` | Queue management utilities |
| `_backend.pxd` | C API declarations from libsyclinterface |
| `enum_types.py` | Python enums for SYCL types |

## Cython Conventions

### Required Directives (after license header)
```cython
# distutils: language = c++
# cython: language_level=3
# cython: linetrace=True
```

### Extension Type Pattern
```cython
cdef class SyclDevice:
    cdef DPCTLSyclDeviceRef _device_ref  # C reference

    def __dealloc__(self):
        if self._device_ref is not NULL:
            DPCTLDevice_Delete(self._device_ref)

    cdef DPCTLSyclDeviceRef get_device_ref(self):
        return self._device_ref
```

### Key Rules
- Store C references as `_*_ref` attributes
- Always clean up in `__dealloc__` with NULL check
- Use `with nogil:` for blocking C calls
- Check NULL before using C API returns

### Exceptions
- `SyclDeviceCreationError`
- `SyclQueueCreationError`
- `SyclContextCreationError`

## cimport vs import

```cython
# cimport - C-level declarations (compile-time)
from ._backend cimport DPCTLSyclDeviceRef, DPCTLDevice_Create

# import - Python-level (runtime)
from . import _device_selection
```
41 changes: 41 additions & 0 deletions dpctl/memory/AGENTS.md
@@ -0,0 +1,41 @@
# dpctl/memory/ - USM Memory Management

## Purpose

Python classes for SYCL Unified Shared Memory (USM) allocation.

## USM Types

| Class | USM Type | Description |
|-------|----------|-------------|
| `MemoryUSMDevice` | Device | Device-only, fastest access |
| `MemoryUSMShared` | Shared | Host and device accessible |
| `MemoryUSMHost` | Host | Host memory, device accessible |

## __sycl_usm_array_interface__

All memory classes implement this protocol:
```python
{
"data": (ptr, readonly_flag),
"shape": (nbytes,),
"strides": None,
"typestr": "|u1",
"version": 1,
"syclobj": queue
}
```
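
As a rough illustration of the layout above, the sketch below builds such a dictionary and checks a candidate for the listed keys. The helper names `make_usm_interface` and `looks_like_usm_interface` are hypothetical, the pointer and queue values are placeholders, and treating every key as required is an assumption of this sketch.

```python
def make_usm_interface(ptr, nbytes, syclobj, readonly=False):
    """Build a dict laid out like the protocol shown above."""
    return {
        "data": (ptr, readonly),
        "shape": (nbytes,),
        "strides": None,
        "typestr": "|u1",
        "version": 1,
        "syclobj": syclobj,
    }


REQUIRED_KEYS = {"data", "shape", "strides", "typestr", "version", "syclobj"}


def looks_like_usm_interface(obj) -> bool:
    """Check a candidate dict for the keys listed above (assumed required)."""
    return isinstance(obj, dict) and REQUIRED_KEYS <= obj.keys()


iface = make_usm_interface(ptr=0x1000, nbytes=64, syclobj="queue-placeholder")
print(looks_like_usm_interface(iface))  # True
```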

## Memory Lifetime Rules

1. **Queue-bound:** Memory tied to specific queue/context
2. **Outlive operations:** Keep memory alive until operations complete
3. **Views extend lifetime:** Views keep base memory alive
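
The third rule can be pictured in plain Python: a view that holds a strong reference to its base keeps the base alive after the caller drops its own reference. `Buffer` and `View` are hypothetical stand-ins, not dpctl classes, and the final check assumes CPython-style collection once the last reference is gone.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for a memory object that owns an allocation."""

    def __init__(self, nbytes):
        self.nbytes = nbytes


class View:
    """Hypothetical view that keeps a strong reference to its base."""

    def __init__(self, base, offset, nbytes):
        self.base = base
        self.offset = offset
        self.nbytes = nbytes


buf = Buffer(64)
probe = weakref.ref(buf)           # observes the buffer without owning it
view = View(buf, offset=16, nbytes=8)

del buf                            # caller drops its reference
alive_with_view = probe() is not None

del view                           # last owner goes away
gc.collect()
alive_after_view = probe() is not None

print(alive_with_view, alive_after_view)
```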

## Key Files

| File | Purpose |
|------|---------|
| `_memory.pyx` | Memory class implementations |
| `_memory.pxd` | Cython declarations |
| `__init__.py` | Public API exports |
39 changes: 39 additions & 0 deletions dpctl/program/AGENTS.md
@@ -0,0 +1,39 @@
# dpctl/program/ - SYCL Kernel Compilation

## Purpose

Compile and manage SYCL kernels from OpenCL C or SPIR-V source.

## Key Files

| File | Purpose |
|------|---------|
| `_program.pyx` | `SyclProgram`, `SyclKernel` extension types |
| `_program.pxd` | Cython declarations |
| `__init__.py` | Public API exports |

## Classes

- **`SyclProgram`** - Compiled SYCL program containing one or more kernels
- **`SyclKernel`** - Individual kernel extracted from a program

## Usage Pattern

```python
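import dpctl
from dpctl.program import create_program_from_source

# OpenCL C source for an element-wise vector addition kernel
source = """
__kernel void add(__global float* a, __global float* b, __global float* c) {
    int i = get_global_id(0);
    c[i] = a[i] + b[i];
}
"""

# OpenCL C source needs a queue on an OpenCL device
queue = dpctl.SyclQueue("opencl")
program = create_program_from_source(queue, source)
kernel = program.get_sycl_kernel("add")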
```

## Notes

- Programs are context-bound
- Follows same Cython patterns as core dpctl (see `../AGENTS.md`)