fal package

Subpackages

Submodules

fal.app module

class fal.app.App(*, _allow_init=False)

Bases: BaseServable

Create a fal serverless application.

Subclass this to define your application with custom setup, endpoints, and configuration. The App class handles model loading, request routing, and lifecycle management.

Example

>>> class TextToImage(fal.App, machine_type="GPU-A100"):
...     requirements = ["diffusers", "torch"]
...
...     def setup(self):
...         self.pipe = StableDiffusionPipeline.from_pretrained(
...             "runwayml/stable-diffusion-v1-5"
...         )
...
...     @fal.endpoint("/")
...     def generate(self, prompt: str) -> dict:
...         image = self.pipe(prompt).images[0]
...         return {"url": fal.toolkit.upload_image(image)}
requirements

Pip packages to install in the environment. Supports standard pip syntax, including version specifiers. Use a list of strings for a single install step, or a list of lists to install in multiple steps. Example: ["numpy==1.24.0", "torch>=2.0.0"] or [["setuptools", "wheel"], ["numpy==1.24.0"]]

local_python_modules

List of local Python module names to include in the deployment. Use for custom code not available on PyPI. Example: ["my_utils", "models"]

machine_type

Compute instance type for your application. CPU options: 'XS', 'S' (default), 'M', 'L'. GPU options: 'GPU-A6000', 'GPU-A100', 'GPU-H100', 'GPU-H200', 'GPU-B200'. Use a string for a single type, or a list to define fallback types (tried in order until one is available). Example: "GPU-A100" or ["GPU-H100", "GPU-A100"]

num_gpus

Number of GPUs to allocate. Only applies to GPU machine types.

regions

Allowed regions for deployment. None means any region. Example: ["us-east", "eu-west"]

host_kwargs

Advanced configuration dictionary passed to the host. For internal use. Prefer using class attributes instead.

app_name

Custom name for the application. Defaults to class name.

app_auth

Authentication mode. Options: 'private' (API key required), 'public' (no auth), 'shared' (shareable link).

app_files

List of files/directories to include in the deployment. Example: ["./models", "./config.yaml"]

app_files_ignore

Regex patterns to exclude from deployment. Default excludes .pyc, __pycache__, .git, .DS_Store.
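These defaults are plain regular expressions; a quick check of what they exclude, assuming substring-search (re.search) semantics against each relative file path:

```python
import re

# The default ignore patterns documented for app_files_ignore.
DEFAULT_IGNORE = [r"\.pyc$", r"__pycache__/", r"\.git/", r"\.DS_Store$"]

def is_ignored(path: str) -> bool:
    """Return True if any default pattern matches somewhere in the path."""
    return any(re.search(pattern, path) for pattern in DEFAULT_IGNORE)

print(is_ignored("models/weights.pyc"))          # True
print(is_ignored("src/__pycache__/m.pyc"))       # True
print(is_ignored("models/weights.safetensors"))  # False
```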

app_files_context_dir

Base directory for resolving app_files paths. Defaults to the directory containing the app file.

request_timeout

Maximum seconds for a single request. None for default.

startup_timeout

Maximum seconds for app startup/setup. None for default.

min_concurrency

Minimum warm instances to keep running. Set to 1+ to avoid cold starts. Default is 0 (scale to zero).

max_concurrency

Maximum instances to scale up to.

concurrency_buffer

Additional instances to keep warm above current load.

concurrency_buffer_perc

Percentage buffer of instances above current load.

scaling_delay

Seconds to wait for a request to be picked up by a runner before triggering a scale up. Useful for apps with slow startup times.

max_multiplexing

Maximum concurrent requests per instance.

kind

Deployment kind. For internal use.

image

Custom container image for the application. Use ContainerImage to specify a Dockerfile.
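Putting several of the attributes above together, a deployment-configuration sketch (machine types, package pins, file paths, and the endpoint body are illustrative placeholders):

```python
import fal

class MyApp(fal.App, machine_type=["GPU-H100", "GPU-A100"]):
    # Two install steps: build tooling first, then the pinned stack.
    requirements = [
        ["setuptools", "wheel"],
        ["torch>=2.0.0", "diffusers"],
    ]
    num_gpus = 1
    min_concurrency = 1          # keep one warm runner to avoid cold starts
    max_concurrency = 4
    regions = ["us-east"]
    app_files = ["./models", "./config.yaml"]

    @fal.endpoint("/")
    def predict(self, prompt: str) -> dict:
        return {"prompt": prompt}
```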

app_auth: ClassVar[Optional[Literal['public', 'private', 'shared']]] = None
app_files: ClassVar[list[str]] = []
app_files_context_dir: ClassVar[Optional[str]] = None
app_files_ignore: ClassVar[list[str]] = ['\\.pyc$', '__pycache__/', '\\.git/', '\\.DS_Store$']
app_name: ClassVar[Optional[str]] = None
collect_routes()
Return type:

dict[RouteSignature, Callable[..., Any]]

concurrency_buffer: ClassVar[int | None] = None
concurrency_buffer_perc: ClassVar[int | None] = None
property current_request: RequestContext | None
classmethod get_endpoints()
Return type:

list[str]

classmethod get_health_check_config()
Return type:

Optional[ApplicationHealthCheckConfig]

handle_exit()

Handle exit signal.

health()
host_kwargs: ClassVar[dict[str, Any]] = {'_scheduler': 'nomad', '_scheduler_options': {'storage_region': 'us-east'}, 'keep_alive': 60, 'resolver': 'uv'}
image: ClassVar[Optional[ContainerImage]] = None
isolate_channel: Channel | None = None
kind: ClassVar[Optional[str]] = None
lifespan(app)
local_file_path: ClassVar[Optional[str]] = None
local_python_modules: ClassVar[list[str]] = []
machine_type: ClassVar[str | list[str]] = 'S'
max_concurrency: ClassVar[int | None] = None
max_multiplexing: ClassVar[int | None] = None
min_concurrency: ClassVar[int | None] = None
num_gpus: ClassVar[int | None] = None
provide_hints()

Provide hints for routing the application.

Return type:

list[str]

regions: ClassVar[Optional[list[str]]] = None
request_timeout: ClassVar[int | None] = None
requirements: ClassVar[list[str] | list[list[str]]] = []
classmethod run_local(*args, **kwargs)
scaling_delay: ClassVar[int | None] = None
setup()

Set up the application before serving.

skip_retry_conditions: ClassVar[Optional[list[Literal['timeout', 'server_error', 'connection_error']]]] = None
classmethod spawn()
Return type:

AppSpawnInfo

startup_timeout: ClassVar[int | None] = None
teardown()

Tear down the application after serving.

termination_grace_period_seconds: ClassVar[int | None] = None
class fal.app.AppClient(cls, url, timeout=None)

Bases: object

classmethod connect(cls, app_cls, *, health_request_timeout=30, startup_timeout=60, health_check_interval=0.5)
health()
exception fal.app.AppClientError(message, status_code, headers=<factory>)

Bases: FalServerlessException

headers: dict[str, str]
message: str
status_code: int
class fal.app.AppSpawnInfo(info)

Bases: object

property application
property future
property logs
property stream
property url
wait(*, health_request_timeout=30, startup_timeout=60, health_check_interval=0.5, headers=None)
Return type:

None

class fal.app.EndpointClient(url, endpoint, signature, timeout=None, headers=None)

Bases: object

class fal.app.RequestContext(request_id, endpoint, lifecycle_preference, headers)

Bases: object

endpoint: str | None
headers: Header
lifecycle_preference: dict[str, str] | None
request_id: str | None
fal.app.endpoint(path, *, is_websocket=False, health_check=None)

Designate the decorated function as an application endpoint.

Return type:

Callable[[TypeVar(EndpointT, bound= Callable[..., Any])], TypeVar(EndpointT, bound= Callable[..., Any])]

async fal.app.open_isolate_channel(address)
Return type:

Channel | None

fal.app.wrap_app(cls, **kwargs)
Return type:

IsolatedFunction

fal.apps module

class fal.apps.Completed(logs)

Bases: _Status

Indicates the request has been completed successfully and the result is ready to be retrieved.

logs: list[dict[str, Any]] | None
class fal.apps.InProgress(logs)

Bases: _Status

Indicates the request is now being actively processed, and provides runtime logs for the inference task.

logs: list[dict[str, Any]] | None
class fal.apps.Queued(position)

Bases: _Status

Indicates the request is still in the queue, and provides the position in the queue for ETA calculation.

position: int
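These status classes are typically consumed by type-checking each event yielded from a request handle. A pure-Python sketch of that dispatch, using stand-in dataclasses (the real Queued/InProgress/Completed instances come from the fal service):

```python
from dataclasses import dataclass
from typing import List, Optional

# Stand-ins mirroring fal.apps.Queued / InProgress / Completed, defined
# locally so the dispatch pattern can be shown without a live request.
@dataclass
class Queued:
    position: int

@dataclass
class InProgress:
    logs: Optional[List[dict]]

@dataclass
class Completed:
    logs: Optional[List[dict]]

def describe(event) -> str:
    """Type-dispatch on a status event, as you would with iter_events() output."""
    if isinstance(event, Queued):
        return f"queued at position {event.position}"
    if isinstance(event, InProgress):
        return "running"
    if isinstance(event, Completed):
        return "done"
    raise TypeError(f"unknown status: {event!r}")

events = [Queued(position=3), InProgress(logs=None), Completed(logs=None)]
print([describe(e) for e in events])
# → ['queued at position 3', 'running', 'done']
```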
class fal.apps.RequestHandle(app_id, request_id, _client=<factory>, _creds=<factory>)

Bases: object

A handle to an async inference request.

app_id: str
cancel()

Cancel an async inference request.

Return type:

None

fetch_raw_response()
Return type:

Response

fetch_result()

Retrieve the result of an async inference request; raises an exception if the request is not completed yet.

Return type:

dict[str, Any]

get()

Retrieve the result of an async inference request, polling the status of the request until it is completed.

Return type:

dict[str, Any]

iter_events(*, logs=False, _RequestHandle__poll_delay=0.2)

Yield all events for the given task until it is completed.

Return type:

Iterator[_Status]

request_id: str
status(*, logs=False)

Check the status of an async inference request.

Return type:

_Status

fal.apps.run(app_id, arguments, *, path='')

Run an inference task on a Fal app and return the result.

Return type:

dict[str, Any]

fal.apps.stream(app_id, arguments, *, path='')

Stream an inference task on a Fal app.

Return type:

Iterator[str | bytes]

fal.apps.submit(app_id, arguments, *, path='')

Submit an async inference task to the app. Returns a request handle which can be used to check the status of the request and retrieve the result.

Return type:

RequestHandle
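An end-to-end sketch of the queue workflow. The app id and payload are hypothetical placeholders, and real calls require fal credentials, so everything is wrapped in a function and nothing runs at definition time:

```python
def run_queued(app_id, arguments):
    """Submit a request, watch its status events, then fetch the result."""
    import fal.apps  # deferred: needs credentials only when actually called

    handle = fal.apps.submit(app_id, arguments)
    for event in handle.iter_events(logs=True):
        if isinstance(event, fal.apps.Queued):
            print(f"queue position: {event.position}")
        elif isinstance(event, fal.apps.InProgress):
            print("in progress")
    return handle.get()  # polls until Completed, then returns the result dict

# Hypothetical usage:
# result = run_queued("your-user/text-to-image", {"prompt": "a cat"})
```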

fal.apps.ws(app_id, *, path='')

Connect to an HTTP endpoint using the websocket protocol. This is an internal and experimental API; use it at your own risk.

Return type:

Iterator[_WSConnection]

fal.compat module

async fal.compat.run_in_thread(func, *args, **kwargs)

Run sync code on a worker thread with Python 3.8+ support.

Return type:

Any
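This helper exists so blocking sync code doesn't stall the event loop (on Python 3.9+, the stdlib offers the same idea as asyncio.to_thread). An illustration of the pattern, not fal's implementation:

```python
import asyncio
import time

def blocking_work(n):
    """Stands in for synchronous, blocking code (I/O- or CPU-bound)."""
    time.sleep(0.01)
    return n * 2

async def main():
    # Hand the sync work to a worker thread so the loop stays responsive.
    return await asyncio.to_thread(blocking_work, 21)

print(asyncio.run(main()))  # → 42
```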

fal.config module

class fal.config.Config(*, validate_profile=False, profile=None)

Bases: object

DEFAULT_CONFIG_PATH = '~/.fal/config.toml'
delete_profile(profile)
Return type:

None

edit()
Return type:

Iterator[Config]

get(key)
Return type:

Optional[str]

get_internal(key)
Return type:

Optional[str]

property profile: str | None
profiles()
Return type:

List[str]

save()
Return type:

None

set(key, value)
Return type:

None

set_internal(key, value)
Return type:

None

unset(key)
Return type:

None

unset_internal(key)
Return type:

None

fal.container module

class fal.container.ContainerImage(dockerfile_str, build_args=<factory>, registries=<factory>, builder=None, compression='gzip', force_compression=False, secrets=<factory>, context_dir=PosixPath('/home/runner/work/fal/fal/projects/fal'), dockerignore=None, dockerignore_path=None)

Bases: object

ContainerImage represents a Docker image that can be built from a Dockerfile.
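A construction sketch using the from_dockerfile_str constructor documented below. The Dockerfile content and build args are illustrative, and the import is deferred into a function so the sketch stands alone:

```python
def make_image():
    """Build a ContainerImage from an inline Dockerfile string."""
    from fal.container import ContainerImage  # requires the fal package

    return ContainerImage.from_dockerfile_str(
        "FROM python:3.11-slim\n"
        "RUN pip install --no-cache-dir torch\n"
        "COPY src/ /app/src/\n",
        build_args={"PYTHON_VERSION": "3.11"},  # optional build arguments
    )

# An App can then point at it via its `image` class attribute:
# class MyApp(fal.App):
#     image = make_image()
```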

add_dockerignore(patterns=None, path=None)

Add or update dockerignore patterns.

Sets the internal dockerignore patterns using gitignore-style matching. You can provide either a list of patterns or a path to a .dockerignore file.

Parameters:
  • patterns (Optional[List[str]]) – List of gitignore-style patterns

  • path (Optional[PathLike]) – Path to a .dockerignore file

Raises:

ValueError – If both patterns and path are provided, or neither

Return type:

None

build_args: Dict[str, str]
builder: Optional[Literal['depot', 'service', 'worker']] = None
compression: str = 'gzip'
context_dir: PathLike = PosixPath('/home/runner/work/fal/fal/projects/fal')
dockerfile_str: str
dockerignore: Optional[List[str]] = None
dockerignore_path: Optional[PathLike] = None
force_compression: bool = False
classmethod from_dockerfile(path, **kwargs)
Return type:

ContainerImage

classmethod from_dockerfile_str(text, **kwargs)
Return type:

ContainerImage

get_copy_add_sources()

Get list of src paths/patterns from COPY/ADD commands. This method only parses the Dockerfile - it doesn’t access the filesystem.

Return type:

List[str]

Returns:

List of src paths (e.g., ["src/", "requirements.txt", "*.py"]) that can be passed to FileSync.sync_files(). Returns an empty list if no COPY/ADD commands are found.

registries: Dict[str, Dict[str, str]]
secrets: Dict[str, str]
to_dict()
Return type:

dict

class fal.container.DockerfileParser(content)

Bases: object

content: str
normalized_content: str
parse_copy_add_sources()
Parse COPY and ADD commands to extract source paths.
  • Skips COPY --from=… (multi-stage builds)

  • Skips ADD with URLs (http://, https://)

  • Normalizes absolute paths by stripping leading slash (Docker treats them as relative to the build context)

  • Handles both shell form and JSON form

Return type:

List[str]

Returns:

List of source paths/patterns referenced in COPY/ADD commands.
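A simplified illustration of the rules listed above, handling shell form only (the real parser also handles JSON form and line continuations):

```python
def parse_copy_add_sources(dockerfile: str) -> list:
    """Extract COPY/ADD source paths per the documented rules (shell form)."""
    sources = []
    for line in dockerfile.splitlines():
        parts = line.strip().split()
        if not parts or parts[0].upper() not in ("COPY", "ADD"):
            continue
        if any(a.startswith("--from=") for a in parts[1:]):
            continue  # skip multi-stage COPY --from=...
        args = [a for a in parts[1:] if not a.startswith("--")]
        srcs, _dest = args[:-1], args[-1]
        for src in srcs:
            if src.startswith(("http://", "https://")):
                continue  # skip ADD with URLs
            sources.append(src.lstrip("/"))  # absolute -> context-relative
    return sources

dockerfile = """
FROM python:3.11
COPY requirements.txt /app/
COPY --from=builder /out /app/out
ADD https://example.com/x.tar.gz /tmp/
COPY src/ *.py /app/
"""
print(parse_copy_add_sources(dockerfile))
# → ['requirements.txt', 'src/', '*.py']
```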

class fal.container.DockerignoreHandler(context_dir=None, dockerignore=None, dockerignore_path=None)

Bases: object

context_dir: Optional[PathLike] = None
dockerignore: Optional[List[str]] = None
dockerignore_path: Optional[PathLike] = None
get_patterns()

Get list of ignore patterns.

Priority (highest to lowest):

  1. Explicit dockerignore list
  2. Explicit path to the .dockerignore file
  3. .dockerignore file in the context directory
  4. Default ignore patterns

Return type:

List[str]

Returns:

List of ignore patterns

get_regex_patterns()
Return type:

List[str]

fal.file_sync module

class fal.file_sync.FileMetadata(size, mtime, mode, hash, relative_path, absolute_path)

Bases: object

absolute_path: str
classmethod from_path(file_path, *, relative, absolute)
Return type:

FileMetadata

hash: str
mode: int
mtime: float
relative_path: str
size: int
to_dict()
Return type:

Dict[str, str]

class fal.file_sync.FileSync(local_file_path)

Bases: object

check_hashes_on_server(hashes)
Return type:

List[str]

close()
collect_files(paths, files_context_dir=None)
sync_files(paths, chunk_size=10485760, max_concurrency_uploads=10, files_ignore=[], files_context_dir=None)
Return type:

Tuple[List[FileMetadata], List[AppFileUploadException]]

upload_file_multipart(file_path, metadata, chunk_size=10485760)
Return type:

str

class fal.file_sync.FileSyncOptions(files_list, files_ignore, files_context_dir)

Bases: object

files_context_dir: Optional[str]
files_ignore: List[Pattern]
files_list: List[str]
classmethod from_options(options)
Return type:

FileSyncOptions

fal.file_sync.compute_hash(file_path, mode)
Return type:

str

fal.file_sync.normalize_path(path_str, base_path_str, files_context_dir=None)
Return type:

Tuple[str, str]

fal.file_sync.print_path_tree(file_paths)
fal.file_sync.sanitize_relative_path(rel_path, original_path)
Return type:

str

fal.files module

class fal.files.FalFileSystem(*, host=None, team=None, profile=None, **kwargs)

Bases: AbstractFileSystem

get_file(rpath, lpath, **kwargs)

Copy single remote file to local

info(path, **kwargs)

Give details of entry at path

Returns a single dictionary, with exactly the same information as ls would with detail=True.

The default implementation calls ls and could be overridden by a shortcut. kwargs are passed on to ls().

Some file systems might not be able to measure the file's size, in which case the returned dict will include 'size': None.

Returns:

dict with keys name (full path in the FS), size (in bytes), type (file, directory, or something else), and other FS-specific keys.

ls(path, detail=True, **kwargs)

List objects at path.

This should include subdirectories and files at that location. The difference between a file and a directory must be clear when details are requested.

The specific keys, or perhaps a FileInfo class, or similar, is TBD, but must be consistent across implementations. Must include:

  • full path to the entry (without protocol)

  • size of the entry, in bytes. If the value cannot be determined, will be None.

  • type of entry, “file”, “directory” or other

Additional information may be present, appropriate to the file-system, e.g., generation, checksum, etc.

May use refresh=True|False to allow use of self._ls_from_cache to check for a saved listing and avoid calling the backend. This would be common where listing may be expensive.

Parameters:
  • path (str)

  • detail (bool) – if True, gives a list of dictionaries, where each is the same as the result of info(path). If False, gives a list of paths (str).

  • kwargs – may have additional backend-specific options, such as version information

Returns:

List of strings if detail is False, or list of directory information dicts if detail is True.

mv(path1, path2, recursive=False, maxdepth=None, **kwargs)

Move file(s) from one location to another

put_file(lpath, rpath, mode='overwrite', **kwargs)

Copy single file to remote

put_file_from_url(url, rpath, mode='overwrite', **kwargs)
rename(path, destination, **kwargs)

Alias of AbstractFileSystem.mv.

rm(path, **kwargs)

Delete files.

Parameters:
  • path (str or list of str) – File(s) to delete.

  • recursive (bool) – If file(s) are directories, recursively delete contents and then also remove the directory

  • maxdepth (int or None) – Depth to pass to walk for finding files to delete, if recursive. If None, there will be no limit and infinite recursion may be possible.

fal.flags module

fal.flags.bool_envvar(name)

fal.project module

fal.project.find_project_root(srcs)

Return a directory containing .git or pyproject.toml.

That directory will be a common parent of all files and directories passed in srcs.

If no directory in the tree contains a marker identifying it as the project root, the root of the file system is returned.

Returns a two-tuple with the first element as the project root path and the second element as a string describing the method by which the project root was discovered.

Return type:

Tuple[Path, str]
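The walk-up behavior described above can be sketched in pure Python. This is an illustration of the documented contract, not fal's implementation:

```python
import os
import tempfile
from pathlib import Path

def find_project_root(srcs):
    """Walk upward from the common parent of srcs, looking for a .git
    directory or a pyproject.toml file; fall back to the filesystem root."""
    common = Path(os.path.commonpath([os.path.abspath(s) for s in srcs]))
    for directory in [common, *common.parents]:
        if (directory / ".git").exists():
            return directory, ".git directory"
        if (directory / "pyproject.toml").is_file():
            return directory, "pyproject.toml"
    return Path(common.anchor), "file system root"

# Demo: a temp project with a pyproject.toml marker at its root.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pyproject.toml").touch()
    pkg = Path(tmp) / "pkg"
    pkg.mkdir()
    root, how = find_project_root([str(pkg / "mod.py")])
    print(how)  # → pyproject.toml
```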

fal.project.find_pyproject_toml(path_search_start=None)

Find the absolute filepath to a pyproject.toml if it exists

Return type:

Optional[str]

fal.project.parse_pyproject_toml(path_config)

Parse a pyproject toml file, pulling out relevant parts for fal.

If parsing fails, will raise a tomli.TOMLDecodeError.

Return type:

Dict[str, Any]

fal.realtime module

fal.realtime.msgpack_decode_message(message)
Return type:

Any

fal.realtime.msgpack_encode_message(message)
Return type:

bytes

fal.realtime.realtime(path, *, buffering=None, session_timeout=None, input_modal=<object object>, output_modal=<object object>, max_batch_size=1, content_type='application/msgpack', encode_message=None, decode_message=None)

Designate the decorated function as a realtime application endpoint.

Return type:

Callable[[TypeVar(EndpointT, bound= Callable[..., Any])], TypeVar(EndpointT, bound= Callable[..., Any])]

fal.ref module

fal.ref.get_current_app()
Return type:

Optional[App]

fal.ref.set_current_app(app)

fal.sdk module

class fal.sdk.AliasInfo(alias, revision, auth_mode, keep_alive, max_concurrency, max_multiplexing, active_runners, min_concurrency, concurrency_buffer, concurrency_buffer_perc, scaling_delay, machine_types, request_timeout, startup_timeout, valid_regions, environment_name=None)

Bases: object

active_runners: int
alias: str
auth_mode: str
concurrency_buffer: int
concurrency_buffer_perc: int
environment_name: str | None = None
keep_alive: int
machine_types: list[str]
max_concurrency: int
max_multiplexing: int
min_concurrency: int
request_timeout: int
revision: str
scaling_delay: int
startup_timeout: int
valid_regions: list[str]
class fal.sdk.ApplicationHealthCheckConfig(path, start_period_seconds, timeout_seconds, failure_threshold, call_regularly)

Bases: object

call_regularly: Optional[bool]
failure_threshold: Optional[int]
path: str
start_period_seconds: Optional[int]
timeout_seconds: Optional[int]
class fal.sdk.ApplicationInfo(application_id, keep_alive, max_concurrency, max_multiplexing, active_runners, min_concurrency, concurrency_buffer, concurrency_buffer_perc, scaling_delay, machine_types, request_timeout, startup_timeout, valid_regions, created_at, environment_name=None)

Bases: object

active_runners: int
application_id: str
concurrency_buffer: int
concurrency_buffer_perc: int
created_at: datetime
environment_name: str | None = None
keep_alive: int
machine_types: list[str]
max_concurrency: int
max_multiplexing: int
min_concurrency: int
request_timeout: int
scaling_delay: int
startup_timeout: int
valid_regions: list[str]
class fal.sdk.AuthenticatedCredentials(user=<factory>, team=None)

Bases: Credentials

team: str | None = None
to_grpc()
Return type:

ChannelCredentials

to_headers()
Return type:

dict[str, str]

user: UserAccess
class fal.sdk.Credentials

Bases: object

server_credentials: ServerCredentials = <fal.sdk.RemoteCredentials object>
to_grpc()
Return type:

ChannelCredentials

to_headers()
Return type:

dict[str, str]

class fal.sdk.DeploymentStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

RECREATE = 'recreate'
ROLLING = 'rolling'
static from_proto(proto)
Return type:

DeploymentStrategy

to_proto()
Return type:

int

class fal.sdk.EnvironmentInfo(name, description, is_default, created_at)

Bases: object

created_at: datetime
description: str | None
is_default: bool
name: str
class fal.sdk.FalServerlessClient(hostname, credentials=<factory>)

Bases: object

connect()
Return type:

FalServerlessConnection

credentials: Credentials
hostname: str
class fal.sdk.FalServerlessConnection(hostname, credentials, _stack=<factory>, _stub=None)

Bases: object

close()
create_alias(alias, revision, auth_mode, *, environment_name=None)
Return type:

AliasInfo

create_environment(name, description=None)
Return type:

EnvironmentInfo

create_user_key(scope, alias)
Return type:

tuple[str, str]

credentials: Credentials
define_environment(kind, force=False, **options)
Return type:

EnvironmentDefinition

delete_alias(alias, *, environment_name=None)
Return type:

str | None

delete_application(application_id)
Return type:

None

delete_environment(name)
Return type:

None

delete_secret(name, *, environment_name=None)
Return type:

None

hostname: str
kill_runner(runner_id)
Return type:

None

list_alias_runners(alias, *, list_pending=True, start_time=None, environment_name=None)
Return type:

list[RunnerInfo]

list_aliases(*, environment_name=None)
Return type:

list[AliasInfo]

list_applications(application_name=None, *, environment_name=None)
Return type:

list[ApplicationInfo]

list_environments()
Return type:

list[EnvironmentInfo]

list_runners(start_time=None)
Return type:

list[RunnerInfo]

list_secrets(*, environment_name=None)
Return type:

list[ServerlessSecret]

list_user_keys()
Return type:

list[UserKeyInfo]

register(function, environments, application_name=None, auth_mode=None, *, source_code=None, health_check_config=None, serialization_method='cloudpickle', machine_requirements=None, metadata=None, deployment_strategy, scale=True, private_logs=False, files=None, skip_retry_conditions=None, environment_name=None, termination_grace_period_seconds=None)
Return type:

Iterator[RegisterApplicationResult]

revoke_user_key(key_id)
Return type:

None

rollout_application(application_name, force=False, *, environment_name=None)
Return type:

None

run(function, environments, *, serialization_method='cloudpickle', machine_requirements=None, setup_function=None, files=None, application_name=None, auth_mode=None, environment_name=None)
Return type:

Iterator[HostedRunResult[TypeVar(ResultT)]]

scale(application_name, max_concurrency=None)
Return type:

None

set_secret(name, value, *, environment_name=None)
Return type:

None

stop_runner(runner_id, replace_first=False)
Return type:

None

property stub: IsolateControllerStub
update_application(application_name, keep_alive=None, max_multiplexing=None, max_concurrency=None, min_concurrency=None, concurrency_buffer=None, concurrency_buffer_perc=None, scaling_delay=None, request_timeout=None, startup_timeout=None, valid_regions=None, machine_types=None, *, environment_name=None)
Return type:

AliasInfo

class fal.sdk.FalServerlessKeyCredentials(key_id, key_secret)

Bases: Credentials

key_id: str
key_secret: str
to_grpc()
Return type:

ChannelCredentials

to_headers()
Return type:

dict[str, str]

class fal.sdk.File(hash, relative_path)

Bases: object

hash: str
relative_path: str
class fal.sdk.HealthCheck(*, start_period_seconds=None, timeout_seconds=None, failure_threshold=None, call_regularly=None)

Bases: object

call_regularly: Optional[bool] = None
failure_threshold: Optional[int] = None
start_period_seconds: Optional[int] = None
timeout_seconds: Optional[int] = None
class fal.sdk.HostedRunResult(run_id, status, logs=<factory>, result=None, stream=None, service_urls=None)

Bases: Generic[ResultT]

logs: list[Log]
result: Optional[TypeVar(ResultT)] = None
run_id: str
service_urls: ServiceURLs | None = None
status: HostedRunStatus
stream: Any = None
class fal.sdk.HostedRunState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

INTERNAL_FAILURE = 2
IN_PROGRESS = 0
SUCCESS = 1
class fal.sdk.HostedRunStatus(state)

Bases: object

state: HostedRunState
class fal.sdk.KeyScope(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

ADMIN = 'ADMIN'
API = 'API'
static from_proto(proto)
Return type:

KeyScope

class fal.sdk.LocalCredentials

Bases: ServerCredentials

to_grpc()
Return type:

ChannelCredentials

class fal.sdk.MachineRequirements(machine_types, num_gpus=None, keep_alive=10, base_image=None, exposed_port=None, scheduler=None, scheduler_options=None, max_concurrency=None, max_multiplexing=None, min_concurrency=None, concurrency_buffer=None, concurrency_buffer_perc=None, scaling_delay=None, request_timeout=None, startup_timeout=None, valid_regions=None)

Bases: object

base_image: str | None = None
concurrency_buffer: int | None = None
concurrency_buffer_perc: int | None = None
exposed_port: int | None = None
keep_alive: int = 10
machine_types: list[str]
max_concurrency: int | None = None
max_multiplexing: int | None = None
min_concurrency: int | None = None
num_gpus: int | None = None
request_timeout: int | None = None
scaling_delay: int | None = None
scheduler: str | None = None
scheduler_options: dict[str, Any] | None = None
startup_timeout: int | None = None
valid_regions: list[str] | None = None
class fal.sdk.RegisterApplicationResult(result, logs=<factory>, service_urls=None)

Bases: object

logs: list[Log]
result: RegisterApplicationResultType | None
service_urls: ServiceURLs | None = None
class fal.sdk.RegisterApplicationResultType(application_id)

Bases: object

application_id: str
class fal.sdk.RemoteCredentials

Bases: ServerCredentials

to_grpc()
Return type:

ChannelCredentials

class fal.sdk.ReplaceState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

DID_REPLACE = 'DID_REPLACE'
NO_REPLACE = 'NO_REPLACE'
WILL_REPLACE = 'WILL_REPLACE'
class fal.sdk.RunnerInfo(runner_id, in_flight_requests, expiration_countdown, uptime, external_metadata, revision, alias, state, machine_type, replacement=ReplaceState.NO_REPLACE)

Bases: object

alias: str
expiration_countdown: Optional[int]
external_metadata: dict[str, Any]
in_flight_requests: int
machine_type: str
replacement: ReplaceState = 'NO_REPLACE'
revision: str
runner_id: str
state: RunnerState
uptime: timedelta
class fal.sdk.RunnerState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

DEAD = 'DEAD'
DOCKER_PULL = 'DOCKER_PULL'
DRAINING = 'DRAINING'
FAILURE_DELAY = 'FAILURE_DELAY'
IDLE = 'IDLE'
PENDING = 'PENDING'
RUNNING = 'RUNNING'
SETUP = 'SETUP'
TERMINATED = 'TERMINATED'
TERMINATING = 'TERMINATING'
class fal.sdk.ServerCredentials

Bases: object

property base_options: dict[str, str | int]
to_grpc()
Return type:

ChannelCredentials

class fal.sdk.ServerlessSecret(name, created_at, environment_name=None)

Bases: object

created_at: datetime
environment_name: str | None = None
name: str
class fal.sdk.ServiceURLs(playground, run, queue, ws, log)

Bases: object

log: str
playground: str
queue: str
run: str
ws: str
class fal.sdk.UserKeyInfo(key_id, created_at, scope, alias)

Bases: object

alias: str
created_at: datetime
key_id: str
scope: KeyScope
class fal.sdk.WorkerStatus(worker_id, start_time, end_time, duration, user_id, machine_type)

Bases: object

duration: timedelta
end_time: datetime
machine_type: str
start_time: datetime
user_id: str
worker_id: str
fal.sdk.construct_alias(base_name, environment_name=None)

Construct the full alias with environment suffix.

Examples: ("my-app", None) → "my-app"; ("my-app", "main") → "my-app"; ("my-app", "staging") → "my-app--staging"

Return type:

str

fal.sdk.deconstruct_alias(full_alias, environment_name=None)

Extract base name from full alias for display.

Examples: ("my-app--staging", "staging") → "my-app"; ("my-app", "main") → "my-app"; ("my-app", None) → "my-app"

Return type:

str
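The mapping documented for these two helpers can be expressed directly. This is a reimplementation of the stated examples, assuming "--" is the separator and "main" is the default environment:

```python
from typing import Optional

def construct_alias(base_name: str, environment_name: Optional[str] = None) -> str:
    """Append the environment suffix unless it is the default environment."""
    if environment_name in (None, "main"):
        return base_name
    return f"{base_name}--{environment_name}"

def deconstruct_alias(full_alias: str, environment_name: Optional[str] = None) -> str:
    """Strip the environment suffix back off for display."""
    if environment_name in (None, "main"):
        return full_alias
    suffix = f"--{environment_name}"
    return full_alias[: -len(suffix)] if full_alias.endswith(suffix) else full_alias

print(construct_alias("my-app", "staging"))             # → my-app--staging
print(deconstruct_alias("my-app--staging", "staging"))  # → my-app
```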

fal.sdk.get_agent_credentials(original_credentials)

If running inside a fal Serverless box, use the preconfigured credentials instead of the user provided ones.

Return type:

Credentials

fal.sdk.get_credentials(team=None, key=None, profile=None)
Return type:

Credentials

fal.sdk.get_default_server_credentials()
Return type:

ServerCredentials

fal.sync module

fal.sync.sync_dir(local_dir, remote_dir, force_upload=False)
Return type:

str

fal.upload module

class fal.upload.AppFileMultipartUpload(client, file_hash, metadata, chunk_size=10485760, max_concurrency=10)

Bases: BaseMultipartUpload

property cancel_url: str | None
property complete_url: str
get_complete_payload(parts)
Return type:

dict

get_initiate_payload()
Return type:

Optional[dict]

property initiate_url: str
property part_url: str
class fal.upload.BaseMultipartUpload(client, chunk_size=10485760, max_concurrency=10)

Bases: object

cancel()
Return type:

None

property cancel_url: str | None
complete()
Return type:

str

property complete_url: str
get_complete_payload(parts)
Return type:

dict

get_initiate_payload()
Return type:

Optional[dict]

initiate()
Return type:

str

property initiate_url: str
property part_url: str
upload_file(file_path, on_part_complete=None)
Return type:

str

property upload_id: str
class fal.upload.DataFileMultipartUpload(client, target_path, chunk_size=10485760, max_concurrency=10)

Bases: BaseMultipartUpload

property cancel_url: str | None
property complete_url: str
property initiate_url: str
property part_url: str

fal.utils module

class fal.utils.LoadedFunction(function, endpoints, app_name, app_auth, source_code, class_name=None)

Bases: object

app_auth: Optional[Literal['public', 'private', 'shared']]
app_name: str | None
class_name: str | None = None
endpoints: list[str]
function: IsolatedFunction
source_code: str | None
fal.utils.load_function_from(host, file_path, function_name=None, *, force_env_build=False, options=None, app_name=None, app_auth=None, limit_max_requests=None)
Return type:

LoadedFunction

fal.workflows module

class fal.workflows.AttributeLeaf(leaf, attribute)

Bases: Leaf

attribute: str
execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

leaf: Leaf
property referee: ReferenceLeaf
class fal.workflows.Context(vars)

Bases: object

hydrate(input)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

vars: dict[str, Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]]
class fal.workflows.Display(id, depends, fields)

Bases: Node

execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

fields: list[Leaf]
classmethod from_json(data)
Return type:

Display

to_json()
Return type:

dict[str, Any]

class fal.workflows.IndexLeaf(leaf, index)

Bases: Leaf

execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

index: int
leaf: Leaf
property referee: ReferenceLeaf
class fal.workflows.Leaf

Bases: object

execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

property referee: ReferenceLeaf
exception fal.workflows.MisconfiguredGraphError

Bases: WorkflowSyntaxError

class fal.workflows.Node(id, depends)

Bases: object

depends: set[str]
execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

classmethod from_json(data)
Return type:

Node

id: str
to_json()
Return type:

dict[str, Any]

class fal.workflows.ReferenceLeaf(id)

Bases: Leaf

execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

id: str
property referee: ReferenceLeaf
class fal.workflows.Run(id, depends, app, input)

Bases: Node

app: str
execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

classmethod from_json(data)
Return type:

Run

input: Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]
to_json()
Return type:

dict[str, Any]

class fal.workflows.Workflow(name, input_schema, output_schema, nodes=<factory>, output=None, _app_counter=<factory>)

Bases: object

display(*fields)
Return type:

None

execute(input)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

classmethod from_json(data)
Return type:

Workflow

property input: ReferenceLeaf
input_schema: Dict[str, Any]
name: str
nodes: dict[str, Node]
output: dict[str, Any] | None = None
output_schema: Dict[str, Any]
publish(title, *, is_public=True)
run(app, input)
Return type:

ReferenceLeaf

set_output(output)
Return type:

None

to_dict()
Return type:

dict[str, Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]]

to_json()
Return type:

dict[str, Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]]

exception fal.workflows.WorkflowSyntaxError

Bases: FalServerlessException

fal.workflows.create_workflow(name, input, output)
Return type:

Workflow

fal.workflows.depends(data)
Return type:

set[str]
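Node objects carry an id and a depends set, and depends() recovers the referenced node ids from input data; together these describe a dependency graph over Workflow.nodes. A minimal standalone sketch of executing such a graph in dependency order follows. This is not fal's internal scheduler; the helper, the example node ids, and the ValueError for cyclic (misconfigured) graphs are illustrative only:

```python
# Minimal standalone sketch of running a node graph in dependency
# order, mirroring the Node(id, depends) shape documented above.
# Illustrative only -- not fal's scheduler.

def execute_in_order(nodes):
    """nodes: dict mapping node id -> set of ids it depends on.

    Returns a list of node ids in a valid execution order.
    """
    order = []
    done = set()
    visiting = set()

    def visit(node_id):
        if node_id in done:
            return
        if node_id in visiting:
            # a cyclic graph cannot be executed
            raise ValueError(f"cycle involving {node_id!r}")
        visiting.add(node_id)
        for dep in sorted(nodes[node_id]):
            visit(dep)
        visiting.discard(node_id)
        done.add(node_id)
        order.append(node_id)

    for node_id in sorted(nodes):
        visit(node_id)
    return order

graph = {
    "generate": set(),
    "upscale": {"generate"},
    "display": {"generate", "upscale"},
}
print(execute_in_order(graph))  # ['generate', 'upscale', 'display']
```

Every node runs only after all ids in its depends set have run, which is the contract the depends sets above encode.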

fal.workflows.export_workflow_json(data)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

fal.workflows.import_workflow_json(data)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

fal.workflows.iter_leaves(data)
Return type:

Iterator[Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]]

fal.workflows.main()
Return type:

None

fal.workflows.parse_leaf(raw_leaf)

Parse a leaf reference (of the form $variable.field.field_2[index]) into a tree of Leaf objects.

Return type:

Leaf
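The reference syntax above can be made concrete with a small standalone parser. This is a hypothetical re-implementation written for clarity, not fal's actual parse_leaf: it returns a plain (variable, parts) tuple instead of a Leaf tree.

```python
import re

# Illustration of the "$variable.field.field_2[index]" syntax that
# parse_leaf is documented to accept. Hypothetical code, not fal's.
TOKEN = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")

def parse_reference(raw):
    """Split e.g. "$input.images[0].url" into ("input", ["images", 0, "url"])."""
    m = re.match(r"\$([A-Za-z_]\w*)", raw)
    if m is None:
        raise ValueError(f"not a leaf reference: {raw!r}")
    parts = []
    pos = m.end()
    for tok in TOKEN.finditer(raw, pos):
        if tok.start() != pos:
            raise ValueError(f"unexpected text at offset {pos} in {raw!r}")
        field, index = tok.groups()
        # each token is either a field access or an integer index
        parts.append(field if field is not None else int(index))
        pos = tok.end()
    if pos != len(raw):
        raise ValueError(f"trailing text in {raw!r}")
    return m.group(1), parts

print(parse_reference("$input.images[0].url"))  # ('input', ['images', 0, 'url'])
```

In fal's own representation, the field accesses and [index] steps correspond to chained Leaf subclasses such as IndexLeaf, rooted at a ReferenceLeaf.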

Module contents

class fal.App(*, _allow_init=False)

Bases: BaseServable

Create a fal serverless application.

Subclass this to define your application with custom setup, endpoints, and configuration. The App class handles model loading, request routing, and lifecycle management.

Example

>>> class TextToImage(fal.App, machine_type="GPU"):
...     requirements = ["diffusers", "torch"]
...
...     def setup(self):
...         from diffusers import StableDiffusionPipeline
...
...         self.pipe = StableDiffusionPipeline.from_pretrained(
...             "runwayml/stable-diffusion-v1-5"
...         )
...
...     @fal.endpoint("/")
...     def generate(self, prompt: str) -> dict:
...         image = self.pipe(prompt).images[0]
...         return {"url": fal.toolkit.upload_image(image)}
requirements

Pip packages to install in the environment. Supports standard pip syntax, including version specifiers. Use a list of strings for a single install step, or a list of lists to install in multiple steps. Example: ["numpy==1.24.0", "torch>=2.0.0"] or [["setuptools", "wheel"], ["numpy==1.24.0"]]

local_python_modules

List of local Python module names to include in the deployment. Use for custom code not available on PyPI. Example: ["my_utils", "models"]

machine_type

Compute instance type for your application. CPU options: 'XS', 'S' (default), 'M', 'L'. GPU options: 'GPU-A6000', 'GPU-A100', 'GPU-H100', 'GPU-H200', 'GPU-B200'. Use a string for a single type, or a list to define fallback types (tried in order until one is available). Example: "GPU-A100" or ["GPU-H100", "GPU-A100"]

num_gpus

Number of GPUs to allocate. Only applies to GPU machine types.

regions

Allowed regions for deployment. None means any region. Example: ["us-east", "eu-west"]

host_kwargs

Advanced configuration dictionary passed to the host. For internal use. Prefer using class attributes instead.

app_name

Custom name for the application. Defaults to class name.

app_auth

Authentication mode. Options: ‘private’ (API key required), ‘public’ (no auth), ‘shared’ (shareable link).

app_files

List of files/directories to include in the deployment. Example: ["./models", "./config.yaml"]

app_files_ignore

Regex patterns to exclude from deployment. Default excludes .pyc, __pycache__, .git, .DS_Store.
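The defaults listed further down (['\\.pyc$', '__pycache__/', '\\.git/', '\\.DS_Store$']) are ordinary regular expressions. A standalone check of how such patterns classify paths, assuming they are applied with re.search against relative file paths (the exact matching call is an assumption here):

```python
import re

# Default app_files_ignore patterns as documented. Assumption for
# illustration: each pattern is applied with re.search to the path.
DEFAULT_IGNORE = [r"\.pyc$", r"__pycache__/", r"\.git/", r"\.DS_Store$"]

def is_ignored(path):
    return any(re.search(pattern, path) for pattern in DEFAULT_IGNORE)

print(is_ignored("models/__pycache__/utils.cpython-311.pyc"))  # True
print(is_ignored("models/weights.safetensors"))                # False
```

Note that these are regex patterns, not gitignore globs: a literal dot must be escaped, and $ anchors the match to the end of the path.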

app_files_context_dir

Base directory for resolving app_files paths. Defaults to the directory containing the app file.

request_timeout

Maximum seconds for a single request. None for default.

startup_timeout

Maximum seconds for app startup/setup. None for default.

min_concurrency

Minimum warm instances to keep running. Set to 1+ to avoid cold starts. Default is 0 (scale to zero).

max_concurrency

Maximum instances to scale up to.

concurrency_buffer

Additional instances to keep warm above current load.

concurrency_buffer_perc

Percentage buffer of instances above current load.

scaling_delay

Seconds to wait for a request to be picked up by a runner before triggering a scale-up. Useful for apps with slow startup times.

max_multiplexing

Maximum concurrent requests per instance.

kind

Deployment kind. For internal use.

image

Custom container image for the application. Use ContainerImage to specify a Dockerfile.

app_auth: ClassVar[Optional[Literal['public', 'private', 'shared']]] = None
app_files: ClassVar[list[str]] = []
app_files_context_dir: ClassVar[Optional[str]] = None
app_files_ignore: ClassVar[list[str]] = ['\\.pyc$', '__pycache__/', '\\.git/', '\\.DS_Store$']
app_name: ClassVar[Optional[str]] = None
collect_routes()
Return type:

dict[RouteSignature, Callable[..., Any]]

concurrency_buffer: ClassVar[int | None] = None
concurrency_buffer_perc: ClassVar[int | None] = None
property current_request: RequestContext | None
classmethod get_endpoints()
Return type:

list[str]

classmethod get_health_check_config()
Return type:

Optional[ApplicationHealthCheckConfig]

handle_exit()

Handle exit signal.

health()
host_kwargs: ClassVar[dict[str, Any]] = {'_scheduler': 'nomad', '_scheduler_options': {'storage_region': 'us-east'}, 'keep_alive': 60, 'resolver': 'uv'}
image: ClassVar[Optional[ContainerImage]] = None
isolate_channel: Channel | None = None
kind: ClassVar[Optional[str]] = None
lifespan(app)
local_file_path: ClassVar[Optional[str]] = None
local_python_modules: ClassVar[list[str]] = []
machine_type: ClassVar[str | list[str]] = 'S'
max_concurrency: ClassVar[int | None] = None
max_multiplexing: ClassVar[int | None] = None
min_concurrency: ClassVar[int | None] = None
num_gpus: ClassVar[int | None] = None
provide_hints()

Provide hints for routing the application.

Return type:

list[str]

regions: ClassVar[Optional[list[str]]] = None
request_timeout: ClassVar[int | None] = None
requirements: ClassVar[list[str] | list[list[str]]] = []
classmethod run_local(*args, **kwargs)
scaling_delay: ClassVar[int | None] = None
setup()

Set up the application before serving.

skip_retry_conditions: ClassVar[Optional[list[Literal['timeout', 'server_error', 'connection_error']]]] = None
classmethod spawn()
Return type:

AppSpawnInfo

startup_timeout: ClassVar[int | None] = None
teardown()

Tear down the application after serving.

termination_grace_period_seconds: ClassVar[int | None] = None
class fal.ContainerImage(dockerfile_str, build_args=<factory>, registries=<factory>, builder=None, compression='gzip', force_compression=False, secrets=<factory>, context_dir=PosixPath('/home/runner/work/fal/fal/projects/fal'), dockerignore=None, dockerignore_path=None)

Bases: object

ContainerImage represents a Docker image that can be built from a Dockerfile.

add_dockerignore(patterns=None, path=None)

Add or update dockerignore patterns.

Sets the internal dockerignore patterns using gitignore-style matching. You can provide either a list of patterns or a path to a .dockerignore file.

Parameters:
  • patterns (Optional[List[str]]) – List of gitignore-style patterns

  • path (Optional[PathLike]) – Path to a .dockerignore file

Raises:

ValueError – If both patterns and path are provided, or neither

Return type:

None

build_args: Dict[str, str]
builder: Optional[Literal['depot', 'service', 'worker']] = None
compression: str = 'gzip'
context_dir: PathLike = PosixPath('/home/runner/work/fal/fal/projects/fal')
dockerfile_str: str
dockerignore: Optional[List[str]] = None
dockerignore_path: Optional[PathLike] = None
force_compression: bool = False
classmethod from_dockerfile(path, **kwargs)
Return type:

ContainerImage

classmethod from_dockerfile_str(text, **kwargs)
Return type:

ContainerImage

get_copy_add_sources()

Get the list of src paths/patterns from COPY/ADD commands. This method only parses the Dockerfile; it does not access the filesystem.

Return type:

List[str]

Returns:

List of src paths (e.g., ["src/", "requirements.txt", "*.py"]) that can be passed to FileSync.sync_files(). Returns an empty list if no COPY/ADD commands are found.
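A rough standalone sketch of the kind of Dockerfile parsing described. This is not fal's implementation: line continuations and the JSON form of COPY are deliberately ignored, and COPY --from= (which copies from another build stage rather than the local filesystem) is skipped:

```python
import shlex

# Hypothetical sketch of extracting COPY/ADD src paths from a
# Dockerfile string, in the spirit of get_copy_add_sources().
def copy_add_sources(dockerfile):
    sources = []
    for line in dockerfile.splitlines():
        parts = shlex.split(line.strip())
        if not parts or parts[0].upper() not in ("COPY", "ADD"):
            continue
        if any(p.startswith("--from=") for p in parts[1:]):
            continue  # copies from other build stages, not local files
        args = [p for p in parts[1:] if not p.startswith("--")]
        sources.extend(args[:-1])  # the last argument is the destination
    return sources

df = """
FROM python:3.11-slim
COPY requirements.txt /app/
RUN pip install -r /app/requirements.txt
COPY src/ *.py /app/
"""
print(copy_add_sources(df))  # ['requirements.txt', 'src/', '*.py']
```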

registries: Dict[str, Dict[str, str]]
secrets: Dict[str, str]
to_dict()
Return type:

dict

class fal.FalServerlessKeyCredentials(key_id, key_secret)

Bases: Credentials

key_id: str
key_secret: str
to_grpc()
Return type:

ChannelCredentials

to_headers()
Return type:

dict[str, str]

class fal.HealthCheck(*, start_period_seconds=None, timeout_seconds=None, failure_threshold=None, call_regularly=None)

Bases: object

call_regularly: Optional[bool] = None
failure_threshold: Optional[int] = None
start_period_seconds: Optional[int] = None
timeout_seconds: Optional[int] = None
fal.cached(func)

Cache the result of the given function in memory.

Return type:

Callable[[ParamSpec(ArgsT)], TypeVar(ReturnT, covariant=True)]
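Conceptually this is in-process memoization, similar in spirit to functools.cache. A standalone sketch of the pattern (functools.cache stands in here; it is not fal's implementation):

```python
import functools

# In-memory result caching: the wrapped function runs once per
# distinct argument set, and later calls reuse the stored result.
@functools.cache
def load_model(name):
    print(f"loading {name}...")  # printed only on the first call
    return {"name": name}

a = load_model("sd-1.5")
b = load_model("sd-1.5")  # cache hit: no second "loading" line
print(a is b)  # True
```

This is useful for expensive work (downloading weights, building clients) that should happen at most once per process.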

fal.endpoint(path, *, is_websocket=False, health_check=None)

Designate the decorated function as an application endpoint.

Return type:

Callable[[TypeVar(EndpointT, bound= Callable[..., Any])], TypeVar(EndpointT, bound= Callable[..., Any])]

fal.function(kind='virtualenv', *, host=None, local_python_modules=None, **config)
fal.realtime(path, *, buffering=None, session_timeout=None, input_modal=<object object>, output_modal=<object object>, max_batch_size=1, content_type='application/msgpack', encode_message=None, decode_message=None)

Designate the decorated function as a realtime application endpoint.

Return type:

Callable[[TypeVar(EndpointT, bound= Callable[..., Any])], TypeVar(EndpointT, bound= Callable[..., Any])]

fal.sync_dir(local_dir, remote_dir, force_upload=False)
Return type:

str