fal package

Subpackages

Submodules

fal.app module

class fal.app.App(*, _allow_init=False)

Bases: BaseServable

Create a fal serverless application.

Subclass this to define your application with custom setup, endpoints, and configuration. The App class handles model loading, request routing, and lifecycle management.

Example

>>> import fal
>>> class TextToImage(fal.App, machine_type="GPU"):
...     requirements = ["diffusers", "torch"]
...
...     def setup(self):
...         # Import here so the name resolves where the requirements are installed.
...         from diffusers import StableDiffusionPipeline
...
...         self.pipe = StableDiffusionPipeline.from_pretrained(
...             "runwayml/stable-diffusion-v1-5"
...         )
...
...     @fal.endpoint("/")
...     def generate(self, prompt: str) -> dict:
...         image = self.pipe(prompt).images[0]
...         return {"url": fal.toolkit.upload_image(image)}

requirements

Pip packages to install in the environment. Supports standard pip syntax including version specifiers. Use a list of strings for a single install step, or a list of lists to install in multiple steps. Example: ["numpy==1.24.0", "torch>=2.0.0"] or [["setuptools", "wheel"], ["numpy==1.24.0"]]
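
The two accepted shapes can be normalized into a uniform list of install steps. The sketch below is illustrative only: `to_install_steps` is a hypothetical helper, not part of fal, and it assumes a flat list of strings means exactly one step.

```python
from typing import List, Union

Requirements = Union[List[str], List[List[str]]]


def to_install_steps(requirements: Requirements) -> List[List[str]]:
    """Normalize requirements into a list of install steps (hypothetical helper)."""
    if not requirements:
        return []
    # A flat list of strings is treated as a single install step.
    if all(isinstance(item, str) for item in requirements):
        return [list(requirements)]
    # Otherwise every item is expected to be its own step (a list of strings).
    return [list(step) for step in requirements]


single = to_install_steps(["numpy==1.24.0", "torch>=2.0.0"])
multi = to_install_steps([["setuptools", "wheel"], ["numpy==1.24.0"]])
```

Here `single` holds one step with both packages, while `multi` keeps the two steps separate and in order.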

local_python_modules

List of local Python module names to include in the deployment. Use for custom code not available on PyPI. Example: ["my_utils", "models"]

machine_type

Compute instance type for your application. CPU options: 'XS', 'S' (default), 'M', 'L'. GPU options: 'GPU-A6000', 'GPU-A100', 'GPU-H100', 'GPU-H200', 'GPU-B200'. Use a string for a single type, or a list to define fallback types (tried in order until one is available). Example: "GPU-A100" or ["GPU-H100", "GPU-A100"]
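
The "tried in order until one is available" behavior can be pictured with a small sketch. This is not how the platform schedules machines internally; `pick_machine_type` and the `available` set are hypothetical stand-ins used only to show the ordering semantics.

```python
from typing import Iterable, List, Optional, Union


def pick_machine_type(
    machine_type: Union[str, List[str]],
    available: Iterable[str],
) -> Optional[str]:
    """Return the first requested type that is available (illustrative only)."""
    # A single string behaves like a one-element fallback list.
    candidates = [machine_type] if isinstance(machine_type, str) else list(machine_type)
    available_set = set(available)
    for candidate in candidates:
        if candidate in available_set:
            return candidate
    return None


chosen = pick_machine_type(["GPU-H100", "GPU-A100"], available={"GPU-A100", "S"})
```

With no H100 in the hypothetical pool, `chosen` falls back to the second entry.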

num_gpus

Number of GPUs to allocate. Only applies to GPU machine types.

regions

Allowed regions for deployment. None means any region. Example: ["us-east", "eu-west"]

host_kwargs

Advanced configuration dictionary passed to the host. For internal use. Prefer using class attributes instead.

app_name

Custom name for the application. Defaults to class name.

app_auth

Authentication mode. Options: 'private' (API key required), 'public' (no auth), 'shared' (shareable link).

app_files

List of files/directories to include in deployment. Example: ["./models", "./config.yaml"]

app_files_ignore

Regex patterns to exclude from deployment. Default excludes .pyc, __pycache__, .git, .DS_Store.
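
As a rough illustration of how regex-based exclusion behaves, the sketch below applies the default patterns shown for `app_files_ignore` to a list of relative paths. It assumes the patterns are matched with `re.search` against forward-slash paths; the actual matching rules used by fal may differ, and `filter_ignored` is a hypothetical helper.

```python
import re
from typing import List

# Default patterns as listed for App.app_files_ignore.
DEFAULT_IGNORE = [r"\.pyc$", r"__pycache__/", r"\.git/", r"\.DS_Store$"]


def filter_ignored(paths: List[str], patterns: List[str] = DEFAULT_IGNORE) -> List[str]:
    """Keep only the paths that match none of the regex patterns."""
    compiled = [re.compile(p) for p in patterns]
    return [p for p in paths if not any(rx.search(p) for rx in compiled)]


kept = filter_ignored(
    [
        "app.py",
        "utils/__pycache__/x.cpython-311.pyc",
        ".git/config",
        ".github/workflows/ci.yaml",
        "models/.DS_Store",
        "config.yaml",
    ]
)
```

Because these are regular expressions rather than globs, `\.git/` excludes `.git/config` but leaves `.github/workflows/ci.yaml` in place.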

app_files_context_dir

Base directory for resolving app_files paths. Defaults to the directory containing the app file.

request_timeout

Maximum seconds for a single request. None for default.

startup_timeout

Maximum seconds for app startup/setup. None for default.

min_concurrency

Minimum warm instances to keep running. Set to 1+ to avoid cold starts. Default is 0 (scale to zero).

max_concurrency

Maximum instances to scale up to.

concurrency_buffer

Additional instances to keep warm above current load.

concurrency_buffer_perc

Percentage buffer of instances above current load.

scaling_delay

Seconds to wait for a request to be picked up by a runner before triggering a scale up. Useful for apps with slow startup times.

max_multiplexing

Maximum concurrent requests per instance.

kind

Deployment kind. For internal use.

image

Custom container image for the application. Use ContainerImage to specify a Dockerfile.

secrets

Names of user secrets to expose to the app as environment variables. When omitted, the server applies the default behavior (follows the user’s preferences). Pass an explicit list to opt in to only the secrets listed. Example: ["OPENAI_API_KEY", "HF_TOKEN"]

data_mounts

Persistent data mount paths to expose to the application. Use ["/data"] for full access, or specific subdirectories like ["/data/.cache"]. When omitted (None), the server applies a default based on the user model.

app_auth: ClassVar[Optional[Literal['public', 'private', 'shared']]] = None
app_files: ClassVar[list[str]] = []
app_files_context_dir: ClassVar[Optional[str]] = None
app_files_ignore: ClassVar[list[str]] = ['\\.pyc$', '__pycache__/', '\\.git/', '\\.DS_Store$']
app_name: ClassVar[Optional[str]] = None
collect_routes()
Return type:

dict[RouteSignature, Callable[..., Any]]

concurrency_buffer: ClassVar[int | None] = None
concurrency_buffer_perc: ClassVar[int | None] = None
property current_request: RequestContext | None
data_mounts: ClassVar[Optional[list[str]]] = None
classmethod get_endpoints()
Return type:

list[str]

classmethod get_health_check_config()
Return type:

Optional[ApplicationHealthCheckConfig]

handle_exit()

Handle exit signal.

health()
host_kwargs: ClassVar[dict[str, Any]] = {'_scheduler': 'nomad', '_scheduler_options': {'storage_region': 'us-east'}, 'keep_alive': 60, 'resolver': 'uv'}
image: ClassVar[Optional[ContainerImage]] = None
isolate_channel: Channel | None = None
kind: ClassVar[Optional[str]] = None
lifespan(app)
local_file_path: ClassVar[Optional[str]] = None
local_python_modules: ClassVar[list[str]] = []
machine_type: ClassVar[str | list[str]] = 'S'
max_concurrency: ClassVar[int | None] = None
max_multiplexing: ClassVar[int | None] = None
min_concurrency: ClassVar[int | None] = None
num_gpus: ClassVar[int | None] = None
provide_hints()

Provide hints for routing the application.

Return type:

list[str]

regions: ClassVar[Optional[list[str]]] = None
request_timeout: ClassVar[int | None] = None
requirements: ClassVar[list[str] | list[list[str]]] = []
classmethod run_local(*args, **kwargs)
scaling_delay: ClassVar[int | None] = None
secrets: ClassVar[Optional[list[str]]] = None
setup()

Setup the application before serving.

skip_retry_conditions: ClassVar[Optional[list[Literal['timeout', 'server_error', 'connection_error']]]] = None
classmethod spawn()
Return type:

AppSpawnInfo

startup_timeout: ClassVar[int | None] = None
teardown()

Teardown the application after serving.

termination_grace_period_seconds: ClassVar[int | None] = None
class fal.app.AppClient(cls, url, timeout=None)

Bases: object

classmethod connect(cls, app_cls, *, health_request_timeout=30, startup_timeout=60, health_check_interval=0.5)
health()
exception fal.app.AppClientError(message, status_code, headers=<factory>)

Bases: FalServerlessException

headers: dict[str, str]
message: str
status_code: int
class fal.app.AppSpawnInfo(info)

Bases: object

property application
property future
property logs
property stream
property url
wait(*, health_request_timeout=30, startup_timeout=60, health_check_interval=0.5, headers=None)
Return type:

None

class fal.app.EndpointClient(url, endpoint, signature, timeout=None, headers=None)

Bases: object

class fal.app.RequestContext(request_id, endpoint, lifecycle_preference, headers)

Bases: object

endpoint: str | None
headers: dict[str, str]
lifecycle_preference: dict[str, str] | None
request_id: str | None
fal.app.endpoint(path, *, is_websocket=False, health_check=None)

Designate the decorated function as an application endpoint.

Return type:

Callable[[TypeVar(EndpointT, bound= Callable[..., Any])], TypeVar(EndpointT, bound= Callable[..., Any])]

async fal.app.open_isolate_channel(address)
Return type:

Channel | None

fal.app.wrap_app(cls, **kwargs)
Return type:

IsolatedFunction

fal.apps module

class fal.apps.Completed(logs)

Bases: _Status

Indicates the request has been completed successfully and the result is ready to be retrieved.

logs: list[dict[str, Any]] | None
class fal.apps.InProgress(logs)

Bases: _Status

Indicates the request is now being actively processed, and provides runtime logs for the inference task.

logs: list[dict[str, Any]] | None
class fal.apps.Queued(position)

Bases: _Status

Indicates the request is still in the queue, and provides the position in the queue for ETA calculation.

position: int
class fal.apps.RequestHandle(app_id, request_id, _client=<factory>, _creds=<factory>)

Bases: object

A handle to an async inference request.

app_id: str
cancel()

Cancel an async inference request.

Return type:

None

fetch_raw_response()
Return type:

Response

fetch_result()

Retrieve the result of an async inference request. Raises an exception if the request is not completed yet.

Return type:

dict[str, Any]

get()

Retrieve the result of an async inference request, polling the status of the request until it is completed.

Return type:

dict[str, Any]

iter_events(*, logs=False, _RequestHandle__poll_delay=0.2)

Yield all events for the given task until it is completed.

Return type:

Iterator[_Status]
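
The polling pattern behind `iter_events()` and `get()` can be sketched without a network connection. The code below is a minimal stand-in, not the library's implementation: the three dataclasses mirror the `Queued`, `InProgress`, and `Completed` status classes by name only, `fetch_status` is a hypothetical callable replacing the remote status check, and yielding the final `Completed` event is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class Queued:
    position: int


@dataclass
class InProgress:
    logs: Optional[List[dict]] = None


@dataclass
class Completed:
    logs: Optional[List[dict]] = None


def iter_events(fetch_status: Callable[[], object], poll_delay: float = 0.2) -> Iterator[object]:
    """Yield every polled status, stopping after the first Completed one."""
    while True:
        status = fetch_status()
        yield status
        if isinstance(status, Completed):
            break
        # Wait between polls so the status source is not hammered.
        time.sleep(poll_delay)


# A fake status source standing in for the remote queue.
_fake = iter([Queued(position=1), Queued(position=0), InProgress(), Completed()])
events = list(iter_events(lambda: next(_fake), poll_delay=0.0))
```

A blocking `get()`-style call could then be built by exhausting this iterator and fetching the result once the last event has been seen.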

request_id: str
status(*, logs=False)

Check the status of an async inference request.

Return type:

_Status

fal.apps.run(app_id, arguments, *, path='')

Run an inference task on a Fal app and return the result.

Return type:

dict[str, Any]

fal.apps.stream(app_id, arguments, *, path='')

Stream an inference task on a Fal app.

Return type:

Iterator[str | bytes]

fal.apps.submit(app_id, arguments, *, path='')

Submit an async inference task to the app. Returns a request handle which can be used to check the status of the request and retrieve the result.

Return type:

RequestHandle

fal.apps.ws(app_id, *, path='')

Connect to an HTTP endpoint using the WebSocket protocol. This is an internal, experimental API; use it at your own risk.

Return type:

Iterator[_WSConnection]

fal.compat module

async fal.compat.run_in_thread(func, *args, **kwargs)

Run sync code on a worker thread with Python 3.8+ support.

Return type:

Any
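
One common way to run blocking code on a worker thread on Python 3.8, where `asyncio.to_thread` is not yet available, is `loop.run_in_executor`. The sketch below shows that technique under the assumption that the default executor is acceptable; it is not necessarily how `fal.compat.run_in_thread` is written, and `slow_add` is a made-up example function.

```python
import asyncio
import functools
from typing import Any, Callable


async def run_in_thread(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Run a blocking callable on the event loop's default thread pool."""
    loop = asyncio.get_running_loop()
    # run_in_executor does not accept keyword arguments, so bind them first.
    return await loop.run_in_executor(None, functools.partial(func, *args, **kwargs))


def slow_add(a: int, b: int = 0) -> int:
    return a + b


result = asyncio.run(run_in_thread(slow_add, 2, b=3))
```

Binding arguments with `functools.partial` keeps the wrapper's signature identical to a direct call of the wrapped function.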

fal.config module

class fal.config.Config(*, validate_profile=False, profile=None)

Bases: object

DEFAULT_CONFIG_PATH = '~/.fal/config.toml'
delete_profile(profile)
Return type:

None

edit()
Return type:

Iterator[Config]

get(key)
Return type:

Optional[str]

get_internal(key)
Return type:

Optional[str]

property profile: str | None
profiles()
Return type:

List[str]

save()
Return type:

None

set(key, value)
Return type:

None

set_internal(key, value)
Return type:

None

unset(key)
Return type:

None

unset_internal(key)
Return type:

None

fal.container module

class fal.container.ContainerImage(dockerfile_str, build_args=<factory>, registries=<factory>, builder=None, compression='gzip', force_compression=False, secrets=<factory>, context_dir=PosixPath('/home/runner/work/fal/fal/projects/fal'), dockerignore=None, dockerignore_path=None)

Bases: object

ContainerImage represents a Docker image that can be built from a Dockerfile.

add_dockerignore(patterns=None, path=None)

Add or update dockerignore patterns.

Sets the internal dockerignore patterns using gitignore-style matching. You can provide either a list of patterns or a path to a .dockerignore file.

Parameters:
  • patterns (Optional[List[str]]) – List of gitignore-style patterns

  • path (Optional[PathLike]) – Path to a .dockerignore file

Raises:

ValueError – If both patterns and path are provided, or neither

Return type:

None
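
The "exactly one of `patterns` or `path`" rule can be expressed compactly. The function below is a hypothetical sketch of that validation, not the method's actual body; the comment and blank-line handling when reading the file is an assumption.

```python
from pathlib import Path
from typing import List, Optional, Union


def resolve_dockerignore(
    patterns: Optional[List[str]] = None,
    path: Optional[Union[str, Path]] = None,
) -> List[str]:
    """Return ignore patterns from exactly one of the two sources."""
    # Both given or both missing are equally invalid.
    if (patterns is None) == (path is None):
        raise ValueError("Provide exactly one of 'patterns' or 'path'")
    if patterns is not None:
        return list(patterns)
    # Read a .dockerignore-style file, skipping blank lines and comments.
    lines = (ln.strip() for ln in Path(path).read_text().splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]


from_list = resolve_dockerignore(patterns=["*.log", "build/"])
```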

build_args: Dict[str, str]
builder: Optional[Literal['depot', 'service', 'worker']] = None
compression: str = 'gzip'
context_dir: PathLike = PosixPath('/home/runner/work/fal/fal/projects/fal')
dockerfile_str: str
dockerignore: Optional[List[str]] = None
dockerignore_path: Optional[PathLike] = None
force_compression: bool = False
classmethod from_dockerfile(path, **kwargs)
Return type:

ContainerImage

classmethod from_dockerfile_str(text, **kwargs)
Return type:

ContainerImage

get_copy_add_sources()

Get list of src paths/patterns from COPY/ADD commands. This method only parses the Dockerfile - it doesn’t access the filesystem.

Return type:

List[str]

Returns:

List of src paths (e.g., ["src/", "requirements.txt", "*.py"]) that can be passed to FileSync.sync_files(). Returns empty list if no COPY/ADD commands found.

registries: Dict[str, Dict[str, str]]
secrets: Dict[str, str]
to_dict()
Return type:

dict

class fal.container.DockerfileParser(content)

Bases: object

content: str
normalized_content: str
parse_copy_add_sources()
Parse COPY and ADD commands to extract source paths.
  • Skips COPY --from=… (multi-stage builds)

  • Skips ADD with URLs (http://, https://)

  • Normalizes absolute paths by stripping leading slash (Docker treats them as relative to the build context)

  • Handles both shell form and JSON form

Return type:

List[str]

Returns:

List of source paths/patterns referenced in COPY/ADD commands.
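
The parsing rules listed for `parse_copy_add_sources()` can be approximated in a few lines. This is a simplified sketch under stated assumptions, not the parser shipped in `fal.container`: it ignores heredocs and variable expansion, treats any leading `--flag` as an option, and assumes the last argument of each instruction is the destination.

```python
import json
import shlex
from typing import List


def parse_copy_add_sources(dockerfile: str) -> List[str]:
    """Collect source paths from COPY/ADD instructions (simplified sketch)."""
    # Join backslash line continuations so each instruction is one line.
    text = dockerfile.replace("\\\n", " ")
    sources: List[str] = []
    for raw in text.splitlines():
        parts = raw.strip().split(None, 1)
        if len(parts) != 2 or parts[0].upper() not in ("COPY", "ADD"):
            continue
        rest = parts[1].strip()
        # Peel off leading --flag[=value] options.
        flags = []
        while rest.startswith("--"):
            flag, _, rest = rest.partition(" ")
            flags.append(flag)
            rest = rest.strip()
        if any(f.startswith("--from=") for f in flags):
            continue  # multi-stage copy, not taken from the build context
        # JSON form: ["src", ..., "dest"]; shell form: src ... dest
        args = json.loads(rest) if rest.startswith("[") else shlex.split(rest)
        for src in args[:-1]:  # the last argument is the destination
            if src.startswith(("http://", "https://")):
                continue  # remote ADD source, nothing to sync locally
            sources.append(src.lstrip("/"))
    return sources


example = """
FROM python:3.11 AS build
COPY --from=build /out /app
COPY src/ requirements.txt /app/
ADD https://example.com/file.tar.gz /tmp/
COPY ["/abs/config.yaml", "/etc/app/"]
COPY --chown=app:app *.py /app/
"""
found = parse_copy_add_sources(example)
```

For the example Dockerfile, only the four local sources survive: the multi-stage copy and the URL are skipped, and the absolute path loses its leading slash.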

class fal.container.DockerignoreHandler(context_dir=None, dockerignore=None, dockerignore_path=None)

Bases: object

context_dir: Optional[PathLike] = None
dockerignore: Optional[List[str]] = None
dockerignore_path: Optional[PathLike] = None
get_patterns()

Get list of ignore patterns.

Priority (highest to lowest):
  1. Explicit dockerignore list
  2. Explicit path to the .dockerignore file
  3. .dockerignore file in the context directory
  4. Default ignore patterns

Return type:

List[str]

Returns:

List of ignore patterns
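
The priority order documented for `get_patterns()` amounts to a chain of early returns. The sketch below follows that order; it is not the class's real code, `DEFAULT_PATTERNS` is a placeholder because the actual defaults are not listed in this reference, and the file-reading rules (skip blank lines and `#` comments) are assumed.

```python
import tempfile
from pathlib import Path
from typing import List, Optional

# Placeholder defaults; the real default patterns are not listed in this reference.
DEFAULT_PATTERNS = [".git", "__pycache__"]


def _read_patterns(path: Path) -> List[str]:
    lines = (ln.strip() for ln in path.read_text().splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]


def get_patterns(
    context_dir: Optional[str] = None,
    dockerignore: Optional[List[str]] = None,
    dockerignore_path: Optional[str] = None,
) -> List[str]:
    if dockerignore is not None:        # 1. explicit list
        return list(dockerignore)
    if dockerignore_path is not None:   # 2. explicit file
        return _read_patterns(Path(dockerignore_path))
    if context_dir is not None:         # 3. file in the context directory
        candidate = Path(context_dir) / ".dockerignore"
        if candidate.is_file():
            return _read_patterns(candidate)
    return list(DEFAULT_PATTERNS)       # 4. defaults


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / ".dockerignore").write_text("# comment\n*.log\n\nbuild/\n")
    from_context = get_patterns(context_dir=tmp)
    # An explicit list wins even when a context file exists.
    from_list = get_patterns(context_dir=tmp, dockerignore=["*.tmp"])
```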

get_regex_patterns()
Return type:

List[str]

fal.file_sync module

class fal.file_sync.FileMetadata(size, mtime, mode, hash, relative_path, absolute_path)

Bases: object

absolute_path: str
classmethod from_path(file_path, *, relative, absolute)
Return type:

FileMetadata

hash: str
mode: int
mtime: float
relative_path: str
size: int
to_dict()
Return type:

Dict[str, str]

class fal.file_sync.FileSync(local_file_path, credentials=None)

Bases: object

check_hashes_on_server(hashes)
Return type:

List[str]

close()
collect_files(paths, files_context_dir=None)
sync_files(paths, chunk_size=10485760, max_concurrency_uploads=10, files_ignore=[], files_context_dir=None)
Return type:

Tuple[List[FileMetadata], List[AppFileUploadException]]

upload_file_multipart(file_path, metadata, chunk_size=10485760)
Return type:

str

class fal.file_sync.FileSyncOptions(files_list, files_ignore, files_context_dir)

Bases: object

files_context_dir: Optional[str]
files_ignore: List[Pattern]
files_list: List[str]
classmethod from_options(options)
Return type:

FileSyncOptions

fal.file_sync.compute_hash(file_path, mode)
Return type:

str

fal.file_sync.normalize_path(path_str, base_path_str, files_context_dir=None)
Return type:

Tuple[str, str]

fal.file_sync.print_path_tree(file_paths)
fal.file_sync.sanitize_relative_path(rel_path, original_path)
Return type:

str

fal.files module

class fal.files.FalFileSystem(*, host=None, team=None, profile=None, **kwargs)

Bases: AbstractFileSystem

get_file(rpath, lpath, **kwargs)

Copy single remote file to local

info(path, **kwargs)

Give details of entry at path

Returns a single dictionary, with exactly the same information as ls would with detail=True.

The default implementation calls ls and could be overridden by a shortcut. kwargs are passed on to ls().

Some file systems might not be able to measure the file’s size, in which case, the returned dict will include 'size': None.

Returns:

  • dict with keys: name (full path in the FS), size (in bytes), type (file, directory, or something else) and other FS-specific keys.

ls(path, detail=True, **kwargs)

List objects at path.

This should include subdirectories and files at that location. The difference between a file and a directory must be clear when details are requested.

The specific keys, or perhaps a FileInfo class, or similar, is TBD, but must be consistent across implementations. Must include:

  • full path to the entry (without protocol)

  • size of the entry, in bytes. If the value cannot be determined, will be None.

  • type of entry, “file”, “directory” or other

Additional information may be present, appropriate to the file-system, e.g., generation, checksum, etc.

May use refresh=True|False to allow use of self._ls_from_cache to check for a saved listing and avoid calling the backend. This would be common where listing may be expensive.

Parameters:
  • path (str)

  • detail (bool) – if True, gives a list of dictionaries, where each is the same as the result of info(path). If False, gives a list of paths (str).

  • kwargs – may have additional backend-specific options, such as version information

Returns:

  • List of strings if detail is False, or list of directory information dicts if detail is True.

mv(path1, path2, recursive=False, maxdepth=None, **kwargs)

Move file(s) from one location to another

put_file(lpath, rpath, mode='overwrite', **kwargs)

Copy single file to remote

put_file_from_url(url, rpath, mode='overwrite', **kwargs)
rename(path, destination, **kwargs)

Alias of AbstractFileSystem.mv.

rm(path, **kwargs)

Delete files.

Parameters:
  • path (str or list of str) – File(s) to delete.

  • recursive (bool) – If file(s) are directories, recursively delete contents and then also remove the directory

  • maxdepth (int or None) – Depth to pass to walk for finding files to delete, if recursive. If None, there will be no limit and infinite recursion may be possible.

fal.flags module

fal.flags.bool_envvar(name)

fal.project module

fal.project.find_project_root(srcs)

Return a directory containing .git or pyproject.toml.

That directory will be a common parent of all files and directories passed in srcs.

If no directory in the tree contains a marker that would specify it’s the project root, the root of the file system is returned.

Returns a two-tuple with the first element as the project root path and the second element as a string describing the method by which the project root was discovered.

Return type:

Tuple[Path, str]
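
The search described above (common parent of all sources, then walking upward until a marker is found) can be sketched as follows. This is an illustrative reimplementation, not fal's code; in particular the strings returned as the discovery method are assumptions.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable, Tuple


def find_project_root(srcs: Iterable[str]) -> Tuple[Path, str]:
    """Walk up from the common parent of srcs looking for a root marker."""
    resolved = [Path(s).resolve() for s in srcs] or [Path.cwd()]
    common = Path(os.path.commonpath([str(p) for p in resolved]))
    start = common if common.is_dir() else common.parent
    for directory in (start, *start.parents):
        if (directory / ".git").exists():
            return directory, ".git directory"
        if (directory / "pyproject.toml").is_file():
            return directory, "pyproject.toml"
    # No marker anywhere: fall back to the file system root.
    return Path(start.anchor), "file system root"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "pyproject.toml").write_text("")
    (root / "pkg" / "sub").mkdir(parents=True)
    (root / "pkg" / "a.py").write_text("")
    found, method = find_project_root(
        [str(root / "pkg" / "a.py"), str(root / "pkg" / "sub")]
    )
    found_is_root = found == root
```

The two sources share `pkg/` as their common parent, so the walk starts there and stops one level up, where the marker file lives.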

fal.project.find_pyproject_toml(path_search_start=None)

Find the absolute filepath to a pyproject.toml if it exists

Return type:

Optional[str]

fal.project.parse_pyproject_toml(path_config)

Parse a pyproject toml file, pulling out relevant parts for fal.

If parsing fails, will raise a tomli.TOMLDecodeError.

Return type:

Dict[str, Any]

fal.realtime module

fal.realtime.msgpack_decode_message(message)
Return type:

Any

fal.realtime.msgpack_encode_message(message)
Return type:

bytes

fal.realtime.realtime(path, *, buffering=None, session_timeout=None, input_modal=<object object>, output_modal=<object object>, max_batch_size=1, content_type='application/msgpack', encode_message=None, decode_message=None)

Designate the decorated function as a realtime application endpoint.

Return type:

Callable[[TypeVar(EndpointT, bound= Callable[..., Any])], TypeVar(EndpointT, bound= Callable[..., Any])]

fal.ref module

fal.ref.get_current_app()
Return type:

Optional[App]

fal.ref.set_current_app(app)

fal.sdk module

class fal.sdk.AliasInfo(alias, revision, auth_mode, keep_alive, max_concurrency, max_multiplexing, active_runners, min_concurrency, concurrency_buffer, concurrency_buffer_perc, scaling_delay, machine_types, request_timeout, startup_timeout, valid_regions, environment_name=None)

Bases: object

active_runners: int
alias: str
auth_mode: str
concurrency_buffer: int
concurrency_buffer_perc: int
environment_name: str | None = None
keep_alive: int
machine_types: list[str]
max_concurrency: int
max_multiplexing: int
min_concurrency: int
request_timeout: int
revision: str
scaling_delay: int
startup_timeout: int
valid_regions: list[str]
class fal.sdk.ApplicationHealthCheckConfig(path, start_period_seconds, timeout_seconds, failure_threshold, call_regularly)

Bases: object

call_regularly: Optional[bool]
failure_threshold: Optional[int]
path: str
start_period_seconds: Optional[int]
timeout_seconds: Optional[int]
class fal.sdk.ApplicationInfo(application_id, keep_alive, max_concurrency, max_multiplexing, active_runners, min_concurrency, concurrency_buffer, concurrency_buffer_perc, scaling_delay, machine_types, request_timeout, startup_timeout, valid_regions, created_at, environment_name=None)

Bases: object

active_runners: int
application_id: str
concurrency_buffer: int
concurrency_buffer_perc: int
created_at: datetime
environment_name: str | None = None
keep_alive: int
machine_types: list[str]
max_concurrency: int
max_multiplexing: int
min_concurrency: int
request_timeout: int
scaling_delay: int
startup_timeout: int
valid_regions: list[str]
class fal.sdk.AuthenticatedCredentials(user=<factory>, team=None)

Bases: Credentials

team: str | None = None
to_grpc()
Return type:

ChannelCredentials

to_headers()
Return type:

dict[str, str]

user: UserAccess
class fal.sdk.Credentials

Bases: object

server_credentials: ServerCredentials = <fal.sdk.RemoteCredentials object>
to_grpc()
Return type:

ChannelCredentials

to_headers()
Return type:

dict[str, str]

class fal.sdk.DeploymentStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

RECREATE = 'recreate'
ROLLING = 'rolling'
static from_proto(proto)
Return type:

DeploymentStrategy

to_proto()
Return type:

int

class fal.sdk.EnvironmentInfo(name, description, is_default, created_at)

Bases: object

created_at: datetime
description: str | None
is_default: bool
name: str
class fal.sdk.FalServerlessClient(hostname, credentials=<factory>)

Bases: object

connect()
Return type:

FalServerlessConnection

credentials: Credentials
hostname: str
class fal.sdk.FalServerlessConnection(hostname, credentials, _stack=<factory>, _stub=None)

Bases: object

close()
create_alias(alias, revision, auth_mode, *, environment_name=None, deployment_strategy=None)
Return type:

AliasInfo

create_environment(name, description=None)
Return type:

EnvironmentInfo

create_user_key(scope, alias)
Return type:

tuple[str, str]

credentials: Credentials
define_environment(kind, force=False, **options)
Return type:

EnvironmentDefinition

delete_alias(alias, *, environment_name=None)
Return type:

str | None

delete_application(application_id)
Return type:

None

delete_environment(name)
Return type:

None

delete_secret(name, *, environment_name=None)
Return type:

None

hostname: str
kill_runner(runner_id)
Return type:

None

list_alias_runners(alias, *, list_pending=True, start_time=None, environment_name=None)
Return type:

list[RunnerInfo]

list_aliases(*, environment_name=None)
Return type:

list[AliasInfo]

list_applications(application_name=None, *, environment_name=None)
Return type:

list[ApplicationInfo]

list_environments()
Return type:

list[EnvironmentInfo]

list_runners(start_time=None)
Return type:

list[RunnerInfo]

list_secrets(*, environment_name=None)
Return type:

list[ServerlessSecret]

list_user_keys()
Return type:

list[UserKeyInfo]

register(function, environments, application_name=None, auth_mode=None, *, source_code=None, health_check_config=None, serialization_method='cloudpickle', machine_requirements=None, metadata=None, deployment_strategy, scale=True, private_logs=None, files=None, skip_retry_conditions=None, environment_name=None, termination_grace_period_seconds=None, secrets=None, data_mounts=None)
Return type:

Iterator[RegisterApplicationResult]

revoke_user_key(key_id)
Return type:

None

rollout_application(application_name, force=False, *, environment_name=None)
Return type:

None

run(function, environments, *, serialization_method='cloudpickle', machine_requirements=None, setup_function=None, files=None, application_name=None, auth_mode=None, environment_name=None, secrets=None, data_mounts=None)
Return type:

Iterator[HostedRunResult[TypeVar(ResultT)]]

scale(application_name, max_concurrency=None)
Return type:

None

set_secret(name, value, *, environment_name=None)
Return type:

None

stop_runner(runner_id, replace_first=False)
Return type:

None

property stub: IsolateControllerStub
update_application(application_name, keep_alive=None, max_multiplexing=None, max_concurrency=None, min_concurrency=None, concurrency_buffer=None, concurrency_buffer_perc=None, scaling_delay=None, request_timeout=None, startup_timeout=None, valid_regions=None, machine_types=None, *, environment_name=None)
Return type:

AliasInfo

class fal.sdk.FalServerlessKeyCredentials(key_id, key_secret)

Bases: Credentials

key_id: str
key_secret: str
to_grpc()
Return type:

ChannelCredentials

to_headers()
Return type:

dict[str, str]

class fal.sdk.File(hash, relative_path)

Bases: object

hash: str
relative_path: str
class fal.sdk.HealthCheck(*, start_period_seconds=None, timeout_seconds=None, failure_threshold=None, call_regularly=None)

Bases: object

call_regularly: Optional[bool] = None
failure_threshold: Optional[int] = None
start_period_seconds: Optional[int] = None
timeout_seconds: Optional[int] = None
class fal.sdk.HostedRunResult(run_id, status, logs=<factory>, result=None, stream=None, service_urls=None)

Bases: Generic[ResultT]

logs: list[Log]
result: Optional[TypeVar(ResultT)] = None
run_id: str
service_urls: ServiceURLs | None = None
status: HostedRunStatus
stream: Any = None
class fal.sdk.HostedRunState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

INTERNAL_FAILURE = 2
IN_PROGRESS = 0
SUCCESS = 1
class fal.sdk.HostedRunStatus(state)

Bases: object

state: HostedRunState
class fal.sdk.KeyScope(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

ADMIN = 'ADMIN'
API = 'API'
static from_proto(proto)
Return type:

KeyScope

class fal.sdk.LocalCredentials

Bases: ServerCredentials

to_grpc()
Return type:

ChannelCredentials

class fal.sdk.MachineRequirements(machine_types, num_gpus=None, keep_alive=10, base_image=None, exposed_port=None, scheduler=None, scheduler_options=None, max_concurrency=None, max_multiplexing=None, min_concurrency=None, concurrency_buffer=None, concurrency_buffer_perc=None, scaling_delay=None, request_timeout=None, startup_timeout=None, valid_regions=None)

Bases: object

base_image: str | None = None
concurrency_buffer: int | None = None
concurrency_buffer_perc: int | None = None
exposed_port: int | None = None
keep_alive: int = 10
machine_types: list[str]
max_concurrency: int | None = None
max_multiplexing: int | None = None
min_concurrency: int | None = None
num_gpus: int | None = None
request_timeout: int | None = None
scaling_delay: int | None = None
scheduler: str | None = None
scheduler_options: dict[str, Any] | None = None
startup_timeout: int | None = None
valid_regions: list[str] | None = None
class fal.sdk.RegisterApplicationResult(result, logs=<factory>, service_urls=None)

Bases: object

logs: list[Log]
result: RegisterApplicationResultType | None
service_urls: ServiceURLs | None = None
class fal.sdk.RegisterApplicationResultType(application_id)

Bases: object

application_id: str
class fal.sdk.RemoteCredentials

Bases: ServerCredentials

to_grpc()
Return type:

ChannelCredentials

class fal.sdk.ReplaceState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

DID_REPLACE = 'DID_REPLACE'
NO_REPLACE = 'NO_REPLACE'
WILL_REPLACE = 'WILL_REPLACE'
class fal.sdk.RunnerInfo(runner_id, in_flight_requests, expiration_countdown, uptime, external_metadata, revision, alias, state, machine_type, replacement=ReplaceState.NO_REPLACE)

Bases: object

alias: str
expiration_countdown: Optional[int]
external_metadata: dict[str, Any]
in_flight_requests: int
machine_type: str
replacement: ReplaceState = 'NO_REPLACE'
revision: str
runner_id: str
state: RunnerState
uptime: timedelta
class fal.sdk.RunnerState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

DEAD = 'DEAD'
DOCKER_PULL = 'DOCKER_PULL'
DRAINING = 'DRAINING'
FAILURE_DELAY = 'FAILURE_DELAY'
IDLE = 'IDLE'
PENDING = 'PENDING'
RUNNING = 'RUNNING'
SETUP = 'SETUP'
TERMINATED = 'TERMINATED'
TERMINATING = 'TERMINATING'
class fal.sdk.ServerCredentials

Bases: object

property base_options: dict[str, str | int]
to_grpc()
Return type:

ChannelCredentials

class fal.sdk.ServerlessSecret(name, created_at, environment_name=None)

Bases: object

created_at: datetime
environment_name: str | None = None
name: str
class fal.sdk.ServiceURLs(playground, run, queue, ws, log)

Bases: object

log: str
playground: str
queue: str
run: str
ws: str
class fal.sdk.UserKeyInfo(key_id, created_at, scope, alias)

Bases: object

alias: str
created_at: datetime
key_id: str
scope: KeyScope
class fal.sdk.WorkerStatus(worker_id, start_time, end_time, duration, user_id, machine_type)

Bases: object

duration: timedelta
end_time: datetime
machine_type: str
start_time: datetime
user_id: str
worker_id: str
fal.sdk.construct_alias(base_name, environment_name=None)

Construct the full alias with environment suffix.

Examples:
  • ("my-app", None) → "my-app"

  • ("my-app", "main") → "my-app"

  • ("my-app", "staging") → "my-app--staging"

Return type:

str
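
The documented examples are consistent with the rule "append the environment as a suffix unless it is missing or the main one". The sketch below encodes that reading; the separator and the name of the main environment are inferred from the examples, not taken from the source code.

```python
from typing import Optional

MAIN_ENVIRONMENT = "main"  # assumed name of the default environment
SEPARATOR = "--"           # assumed separator, inferred from the examples


def construct_alias(base_name: str, environment_name: Optional[str] = None) -> str:
    """Build the full alias, adding a suffix only for non-main environments."""
    if environment_name is None or environment_name == MAIN_ENVIRONMENT:
        return base_name
    return f"{base_name}{SEPARATOR}{environment_name}"


staging_alias = construct_alias("my-app", "staging")
```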

fal.sdk.deconstruct_alias(full_alias, environment_name=None)

Extract base name from full alias for display.

Examples:
  • ("my-app--staging", "staging") → "my-app"

  • ("my-app", "main") → "my-app"

  • ("my-app", None) → "my-app"

Return type:

str
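
Going the other way, the examples suggest that the environment suffix is stripped only when it is actually present. The sketch below is one way to reproduce the documented outputs; the `--` separator and the special handling of `"main"` are assumptions inferred from those examples.

```python
from typing import Optional


def deconstruct_alias(full_alias: str, environment_name: Optional[str] = None) -> str:
    """Strip the environment suffix from an alias, if there is one."""
    if environment_name is None or environment_name == "main":
        return full_alias
    suffix = f"--{environment_name}"  # assumed separator
    if full_alias.endswith(suffix):
        return full_alias[: -len(suffix)]
    return full_alias


display_name = deconstruct_alias("my-app--staging", "staging")
```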

fal.sdk.get_agent_credentials(original_credentials)

If running inside a fal Serverless box, use the preconfigured credentials instead of the user-provided ones.

Return type:

Credentials

fal.sdk.get_credentials(team=None, key=None, profile=None)
Return type:

Credentials

fal.sdk.get_default_server_credentials()
Return type:

ServerCredentials

fal.sync module

fal.sync.sync_dir(local_dir, remote_dir, force_upload=False)
Return type:

str

fal.upload module

class fal.upload.AppFileMultipartUpload(client, file_hash, metadata, chunk_size=10485760, max_concurrency=10)

Bases: BaseMultipartUpload

property cancel_url: str | None
property complete_url: str
get_complete_payload(parts)
Return type:

dict

get_initiate_payload()
Return type:

Optional[dict]

property initiate_url: str
property part_url: str
class fal.upload.BaseMultipartUpload(client, chunk_size=10485760, max_concurrency=10)

Bases: object

cancel()
Return type:

None

property cancel_url: str | None
complete()
Return type:

str

property complete_url: str
get_complete_payload(parts)
Return type:

dict

get_initiate_payload()
Return type:

Optional[dict]

initiate()
Return type:

str

property initiate_url: str
property part_url: str
upload_file(file_path, on_part_complete=None)
Return type:

str

property upload_id: str
class fal.upload.DataFileMultipartUpload(client, target_path, chunk_size=10485760, max_concurrency=10)

Bases: BaseMultipartUpload

property cancel_url: str | None
property complete_url: str
property initiate_url: str
property part_url: str

fal.utils module

class fal.utils.LoadedFunction(function, endpoints, app_name, app_auth, source_code, class_name=None)

Bases: object

app_auth: Optional[Literal['public', 'private', 'shared']]
app_name: str | None
class_name: str | None = None
endpoints: list[str]
function: IsolatedFunction
source_code: str | None
fal.utils.load_function_from(host, file_path, function_name=None, *, force_env_build=False, options=None, app_name=None, app_auth=None, limit_max_requests=None)
Return type:

LoadedFunction

fal.workflows module

class fal.workflows.AttributeLeaf(leaf, attribute)

Bases: Leaf

attribute: str
execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

leaf: Leaf
property referee: ReferenceLeaf
class fal.workflows.Context(vars)

Bases: object

hydrate(input)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

vars: dict[str, Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]]
class fal.workflows.Display(id, depends, fields)

Bases: Node

execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

fields: list[Leaf]
classmethod from_json(data)
Return type:

Display

to_json()
Return type:

dict[str, Any]

class fal.workflows.IndexLeaf(leaf, index)

Bases: Leaf

execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

index: int
leaf: Leaf
property referee: ReferenceLeaf
class fal.workflows.Leaf

Bases: object

execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

property referee: ReferenceLeaf
exception fal.workflows.MisconfiguredGraphError

Bases: WorkflowSyntaxError

class fal.workflows.Node(id, depends)

Bases: object

depends: set[str]
execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

classmethod from_json(data)
Return type:

Node

id: str
to_json()
Return type:

dict[str, Any]

class fal.workflows.ReferenceLeaf(id)

Bases: Leaf

execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

id: str
property referee: ReferenceLeaf
class fal.workflows.Run(id, depends, app, input)

Bases: Node

app: str
execute(context)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

classmethod from_json(data)
Return type:

Run

input: Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]
to_json()
Return type:

dict[str, Any]

class fal.workflows.Workflow(name, input_schema, output_schema, nodes=<factory>, output=None, _app_counter=<factory>)

Bases: object

display(*fields)
Return type:

None

execute(input)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

classmethod from_json(data)
Return type:

Workflow

property input: ReferenceLeaf
input_schema: Dict[str, Any]
name: str
nodes: dict[str, Node]
output: dict[str, Any] | None = None
output_schema: Dict[str, Any]
publish(title, *, is_public=True)
run(app, input)
Return type:

ReferenceLeaf

set_output(output)
Return type:

None

to_dict()
Return type:

dict[str, Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]]

to_json()
Return type:

dict[str, Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]]

exception fal.workflows.WorkflowSyntaxError

Bases: FalServerlessException

fal.workflows.create_workflow(name, input, output)
Return type:

Workflow

fal.workflows.depends(data)
Return type:

set[str]

fal.workflows.export_workflow_json(data)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

fal.workflows.import_workflow_json(data)
Return type:

Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]

fal.workflows.iter_leaves(data)
Return type:

Iterator[Union[Dict[str, Any], List[Any], str, int, float, bool, None, Leaf]]

fal.workflows.main()
Return type:

None

fal.workflows.parse_leaf(raw_leaf)

Parses a leaf (which is in the form of $variable.field.field_2[index] etc.) into a tree of Leaf objects.

Return type:

Leaf
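
The reference syntax described for parse_leaf can be illustrated with a small standalone parser. This is a sketch under stated assumptions, not fal's implementation: the function name parse_reference and the tuple-based node shapes are hypothetical, chosen to loosely mirror ReferenceLeaf(id), AttributeLeaf(leaf, attribute) and IndexLeaf(leaf, index), and only the $variable.field[index] form mentioned above is handled.

```python
import re

# Hypothetical grammar: "$name" head, then ".name" (attribute) or "[n]" (index).
_HEAD = re.compile(r"\$([A-Za-z_]\w*)")
_TOKEN = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def parse_reference(raw: str):
    """Parse '$var.a.b[0]' into nested tuples, wrapping the base reference."""
    head = _HEAD.match(raw)
    if head is None:
        raise ValueError(f"not a reference: {raw!r}")
    node = ("ref", head.group(1))
    pos = head.end()
    while pos < len(raw):
        tok = _TOKEN.match(raw, pos)
        if tok is None:
            raise ValueError(f"unexpected text at offset {pos}: {raw[pos:]!r}")
        if tok.group(1) is not None:
            # ".field" wraps the current node in an attribute access.
            node = ("attr", node, tok.group(1))
        else:
            # "[3]" wraps the current node in an index access.
            node = ("index", node, int(tok.group(2)))
        pos = tok.end()
    return node


tree = parse_reference("$variable.field.field_2[3]")
```

Each accessor wraps the node built so far, so the outermost tuple corresponds to the last accessor in the string and the innermost one to the base variable.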

Module contents

class fal.App(*, _allow_init=False)

Bases: BaseServable

Create a fal serverless application.

Subclass this to define your application with custom setup, endpoints, and configuration. The App class handles model loading, request routing, and lifecycle management.

Example

>>> import fal
>>> class TextToImage(fal.App, machine_type="GPU-A100"):
...     requirements = ["diffusers", "torch"]
...
...     def setup(self):
...         # Imported here so it resolves inside the app's environment.
...         from diffusers import StableDiffusionPipeline
...         self.pipe = StableDiffusionPipeline.from_pretrained(
...             "runwayml/stable-diffusion-v1-5"
...         )
...
...     @fal.endpoint("/")
...     def generate(self, prompt: str) -> dict:
...         image = self.pipe(prompt).images[0]
...         return {"url": fal.toolkit.upload_image(image)}
requirements

Pip packages to install in the environment. Supports standard pip syntax including version specifiers. Use a list of strings for a single install step, or a list of lists to install in multiple steps. Example: ["numpy==1.24.0", "torch>=2.0.0"] or [["setuptools", "wheel"], ["numpy==1.24.0"]]
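
The two accepted shapes of requirements can be compared with a small normalizer. This is only a sketch: install_steps is a hypothetical helper, not part of fal, and it assumes that a flat list is equivalent to a single install step.

```python
from typing import List, Union

Requirements = Union[List[str], List[List[str]]]


def install_steps(requirements: Requirements) -> List[List[str]]:
    """Normalize either accepted shape into a list of install steps."""
    if not requirements:
        return []
    if all(isinstance(item, str) for item in requirements):
        # Flat list of strings: everything is installed in one step.
        return [list(requirements)]
    if all(isinstance(item, list) for item in requirements):
        # List of lists: each inner list is its own step, in order.
        return [list(step) for step in requirements]
    raise TypeError("requirements must be a list of str or a list of lists")


single = install_steps(["numpy==1.24.0", "torch>=2.0.0"])
multi = install_steps([["setuptools", "wheel"], ["numpy==1.24.0"]])
```

The multi-step form is useful when a later package needs build tooling that an earlier step installs.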

local_python_modules

List of local Python module names to include in the deployment. Use for custom code not available on PyPI. Example: ["my_utils", "models"]

machine_type

Compute instance type for your application. CPU options: 'XS', 'S' (default), 'M', 'L'. GPU options: 'GPU-A6000', 'GPU-A100', 'GPU-H100', 'GPU-H200', 'GPU-B200'. Use a string for a single type, or a list to define fallback types (tried in order until one is available). Example: "GPU-A100" or ["GPU-H100", "GPU-A100"]
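
The "tried in order until one is available" rule for a list of machine types can be pictured as follows. The real selection happens on the platform side. Here pick_machine_type and the is_available callback are hypothetical names used only to illustrate the ordering.

```python
from typing import Callable, List, Optional, Union


def pick_machine_type(
    machine_type: Union[str, List[str]],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate that is available, or None if none is."""
    # A single string behaves like a one-element fallback list.
    candidates = [machine_type] if isinstance(machine_type, str) else machine_type
    for candidate in candidates:
        if is_available(candidate):
            return candidate
    return None


# Pretend only A100 capacity is free right now (made-up availability data).
available = {"GPU-A100"}
chosen = pick_machine_type(["GPU-H100", "GPU-A100"], available.__contains__)
```

Listing the preferred type first and a more widely available type second trades peak performance for a better chance of being scheduled quickly.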

num_gpus

Number of GPUs to allocate. Only applies to GPU machine types.

regions

Allowed regions for deployment. None means any region. Example: ["us-east", "eu-west"]

host_kwargs

Advanced configuration dictionary passed to the host. For internal use. Prefer using class attributes instead.

app_name

Custom name for the application. Defaults to class name.

app_auth

Authentication mode. Options: 'private' (API key required), 'public' (no auth), 'shared' (shareable link).

app_files

List of files/directories to include in deployment. Example: ["./models", "./config.yaml"]

app_files_ignore

Regex patterns to exclude from deployment. Default excludes .pyc, __pycache__, .git, .DS_Store.
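
To see what the default app_files_ignore patterns exclude, they can be applied to a few sample paths. This sketch assumes the patterns are matched with re.search against POSIX-style relative paths. The matching mode fal actually uses is not stated here, and is_ignored is a hypothetical helper.

```python
import re

# Default patterns as listed for app_files_ignore.
DEFAULT_IGNORE = [r"\.pyc$", r"__pycache__/", r"\.git/", r"\.DS_Store$"]


def is_ignored(path: str, patterns=DEFAULT_IGNORE) -> bool:
    """True if any pattern matches anywhere in the path (assumed semantics)."""
    return any(re.search(pattern, path) for pattern in patterns)


paths = [
    "app.py",
    "models/weights.bin",
    "utils/__pycache__/helpers.cpython-311.pyc",
    ".git/config",
    "assets/.DS_Store",
    "notes.pyc.txt",
]
kept = [p for p in paths if not is_ignored(p)]
```

Because these are regular expressions rather than glob patterns, the dot must be escaped and the trailing $ anchors a pattern to the end of the path, which is why notes.pyc.txt is kept.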

app_files_context_dir

Base directory for resolving app_files paths. Defaults to the directory containing the app file.

request_timeout

Maximum seconds for a single request. None for default.

startup_timeout

Maximum seconds for app startup/setup. None for default.

min_concurrency

Minimum warm instances to keep running. Set to 1+ to avoid cold starts. Default is 0 (scale to zero).

max_concurrency

Maximum instances to scale up to.

concurrency_buffer

Additional instances to keep warm above current load.

concurrency_buffer_perc

Percentage buffer of instances above current load.

scaling_delay

Seconds to wait for a request to be picked up by a runner before triggering a scale up. Useful for apps with slow startup times.

max_multiplexing

Maximum concurrent requests per instance.

kind

Deployment kind. For internal use.

image

Custom container image for the application. Use ContainerImage to specify a Dockerfile.

secrets

Names of user secrets to expose to the app as environment variables. When omitted, the server applies the default behavior (follows the user’s preferences). Pass an explicit list to opt in to only the secrets listed. Example: ["OPENAI_API_KEY", "HF_TOKEN"]

data_mounts

Persistent data mount paths to expose to the application. Use ["/data"] for full access, or specific subdirectories like ["/data/.cache"]. When omitted (None), the server applies a default based on the user model.

app_auth: ClassVar[Optional[Literal['public', 'private', 'shared']]] = None
app_files: ClassVar[list[str]] = []
app_files_context_dir: ClassVar[Optional[str]] = None
app_files_ignore: ClassVar[list[str]] = ['\\.pyc$', '__pycache__/', '\\.git/', '\\.DS_Store$']
app_name: ClassVar[Optional[str]] = None
collect_routes()
Return type:

dict[RouteSignature, Callable[..., Any]]

concurrency_buffer: ClassVar[int | None] = None
concurrency_buffer_perc: ClassVar[int | None] = None
property current_request: RequestContext | None
data_mounts: ClassVar[Optional[list[str]]] = None
classmethod get_endpoints()
Return type:

list[str]

classmethod get_health_check_config()
Return type:

Optional[ApplicationHealthCheckConfig]

handle_exit()

Handle exit signal.

health()
host_kwargs: ClassVar[dict[str, Any]] = {'_scheduler': 'nomad', '_scheduler_options': {'storage_region': 'us-east'}, 'keep_alive': 60, 'resolver': 'uv'}
image: ClassVar[Optional[ContainerImage]] = None
isolate_channel: Channel | None = None
kind: ClassVar[Optional[str]] = None
lifespan(app)
local_file_path: ClassVar[Optional[str]] = None
local_python_modules: ClassVar[list[str]] = []
machine_type: ClassVar[str | list[str]] = 'S'
max_concurrency: ClassVar[int | None] = None
max_multiplexing: ClassVar[int | None] = None
min_concurrency: ClassVar[int | None] = None
num_gpus: ClassVar[int | None] = None
provide_hints()

Provide hints for routing the application.

Return type:

list[str]

regions: ClassVar[Optional[list[str]]] = None
request_timeout: ClassVar[int | None] = None
requirements: ClassVar[list[str] | list[list[str]]] = []
classmethod run_local(*args, **kwargs)
scaling_delay: ClassVar[int | None] = None
secrets: ClassVar[Optional[list[str]]] = None
setup()

Set up the application before serving.

skip_retry_conditions: ClassVar[Optional[list[Literal['timeout', 'server_error', 'connection_error']]]] = None
classmethod spawn()
Return type:

AppSpawnInfo

startup_timeout: ClassVar[int | None] = None
teardown()

Tear down the application after serving.

termination_grace_period_seconds: ClassVar[int | None] = None
class fal.ContainerImage(dockerfile_str, build_args=<factory>, registries=<factory>, builder=None, compression='gzip', force_compression=False, secrets=<factory>, context_dir=PosixPath('/home/runner/work/fal/fal/projects/fal'), dockerignore=None, dockerignore_path=None)

Bases: object

ContainerImage represents a Docker image that can be built from a Dockerfile.

add_dockerignore(patterns=None, path=None)

Add or update dockerignore patterns.

Sets the internal dockerignore patterns using gitignore-style matching. You can provide either a list of patterns or a path to a .dockerignore file.

Parameters:
  • patterns (Optional[List[str]]) – List of gitignore-style patterns

  • path (Optional[PathLike]) – Path to a .dockerignore file

Raises:

ValueError – If both patterns and path are provided, or neither

Return type:

None
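
The "exactly one of patterns or path" contract described for add_dockerignore can be sketched as a standalone function. This is an illustration of the validation rule only: resolve_dockerignore is a hypothetical name, and the handling of blank lines and # comments when reading a file is an assumption based on common .dockerignore conventions.

```python
from pathlib import Path
from typing import List, Optional


def resolve_dockerignore(
    patterns: Optional[List[str]] = None,
    path: Optional[str] = None,
) -> List[str]:
    """Return ignore patterns from exactly one of the two sources."""
    # Both given, or neither given: the call is ambiguous or empty.
    if (patterns is None) == (path is None):
        raise ValueError("provide exactly one of 'patterns' or 'path'")
    if patterns is not None:
        return list(patterns)
    lines = Path(path).read_text().splitlines()
    # Assumed file format: one pattern per line, '#' starts a comment line.
    return [
        line.strip()
        for line in lines
        if line.strip() and not line.strip().startswith("#")
    ]


inline = resolve_dockerignore(patterns=["*.log", "node_modules/"])
```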

build_args: Dict[str, str]
builder: Optional[Literal['depot', 'service', 'worker']] = None
compression: str = 'gzip'
context_dir: PathLike = PosixPath('/home/runner/work/fal/fal/projects/fal')
dockerfile_str: str
dockerignore: Optional[List[str]] = None
dockerignore_path: Optional[PathLike] = None
force_compression: bool = False
classmethod from_dockerfile(path, **kwargs)
Return type:

ContainerImage

classmethod from_dockerfile_str(text, **kwargs)
Return type:

ContainerImage

get_copy_add_sources()

Get list of src paths/patterns from COPY/ADD commands. This method only parses the Dockerfile - it doesn’t access the filesystem.

Return type:

List[str]

Returns:

List of src paths (e.g., ["src/", "requirements.txt", "*.py"]) that can be passed to FileSync.sync_files(). Returns empty list if no COPY/ADD commands found.
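
The kind of parsing described for get_copy_add_sources can be approximated in a few lines. This is a simplified sketch, not the library's parser: copy_add_sources is a hypothetical function, it ignores line continuations and the JSON-array form of COPY, and skipping --from= copies is an assumption (their sources come from another build stage rather than the local context).

```python
import shlex
from typing import List


def copy_add_sources(dockerfile: str) -> List[str]:
    """Collect source arguments of COPY/ADD lines without touching the disk."""
    sources: List[str] = []
    for line in dockerfile.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) < 2 or parts[0].upper() not in ("COPY", "ADD"):
            continue
        args = shlex.split(parts[1])
        # Assumption: multi-stage copies do not read from the local context.
        if any(arg.startswith("--from=") for arg in args):
            continue
        # Drop remaining flags such as --chown=...
        args = [arg for arg in args if not arg.startswith("--")]
        # The last argument is the destination; the rest are sources.
        sources.extend(args[:-1])
    return sources


DOCKERFILE = """\
FROM python:3.11
COPY src/ /app/src/
COPY requirements.txt /app/
ADD *.py /app/
COPY --from=builder /out /app/out
RUN pip install -r /app/requirements.txt
"""

found = copy_add_sources(DOCKERFILE)
```

Glob patterns such as *.py are returned unexpanded, consistent with a parser that only reads the Dockerfile text.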

registries: Dict[str, Dict[str, str]]
secrets: Dict[str, str]
to_dict()
Return type:

dict

class fal.FalServerlessKeyCredentials(key_id, key_secret)

Bases: Credentials

key_id: str
key_secret: str
to_grpc()
Return type:

ChannelCredentials

to_headers()
Return type:

dict[str, str]

class fal.HealthCheck(*, start_period_seconds=None, timeout_seconds=None, failure_threshold=None, call_regularly=None)

Bases: object

call_regularly: Optional[bool] = None
failure_threshold: Optional[int] = None
start_period_seconds: Optional[int] = None
timeout_seconds: Optional[int] = None
fal.cached(func)

Cache the result of the given function in-memory.

Return type:

Callable[[ParamSpec(ArgsT)], TypeVar(ReturnT, covariant=True)]

fal.endpoint(path, *, is_websocket=False, health_check=None)

Designate the decorated function as an application endpoint.

Return type:

Callable[[TypeVar(EndpointT, bound= Callable[..., Any])], TypeVar(EndpointT, bound= Callable[..., Any])]

fal.function(kind='virtualenv', *, host=None, local_python_modules=None, **config)
fal.realtime(path, *, buffering=None, session_timeout=None, input_modal=<object object>, output_modal=<object object>, max_batch_size=1, content_type='application/msgpack', encode_message=None, decode_message=None)

Designate the decorated function as a realtime application endpoint.

Return type:

Callable[[TypeVar(EndpointT, bound= Callable[..., Any])], TypeVar(EndpointT, bound= Callable[..., Any])]

fal.sync_dir(local_dir, remote_dir, force_upload=False)
Return type:

str