fal package¶
Subpackages¶
- fal.api package
- Submodules
- fal.api.api module
BaseServable, FalFastAPI, FalMissingDependencyError, FalSerializationError, FalServer, FalServerlessError, FalServerlessHost, Host, InternalFalServerlessError, IsolatedFunction, LocalHost, Options, RouteSignature, RouteSignature.buffering, RouteSignature.content_type, RouteSignature.decode_message, RouteSignature.emit_timings, RouteSignature.encode_message, RouteSignature.health_check, RouteSignature.input_modal, RouteSignature.is_websocket, RouteSignature.max_batch_size, RouteSignature.output_modal, RouteSignature.path, RouteSignature.realtime_mode, RouteSignature.session_timeout
ServeWrapper, ServedIsolatedFunction, SpawnInfo, UserFunctionException, cached(), find_missing_dependencies(), function(), merge_basic_config()
- fal.api.apps module
- fal.api.client module
- fal.api.deploy module
- fal.api.environments module
- fal.api.keys module
- fal.api.runners module
- fal.api.secrets module
- Module contents
- fal.auth package
- fal.cli package
- Submodules
- fal.cli.api module
- fal.cli.apps module
- fal.cli.auth module
- fal.cli.cli_nested_json module
- fal.cli.create module
- fal.cli.debug module
- fal.cli.deploy module
- fal.cli.doctor module
- fal.cli.environments module
- fal.cli.files module
- fal.cli.keys module
- fal.cli.main module
- fal.cli.parser module
- fal.cli.profile module
- fal.cli.queue module
- fal.cli.run module
- fal.cli.runners module
- fal.cli.secrets module
- fal.cli.teams module
- Module contents
- fal.console package
- fal.distributed package
- Submodules
- fal.distributed.utils module
- fal.distributed.worker module
DistributedRunner, DistributedRunner.close_zmq_socket(), DistributedRunner.context, DistributedRunner.ensure_alive(), DistributedRunner.gather_errors(), DistributedRunner.get_zmq_socket(), DistributedRunner.invoke(), DistributedRunner.is_alive(), DistributedRunner.keepalive(), DistributedRunner.keepalive_timer, DistributedRunner.maybe_cancel_keepalive(), DistributedRunner.maybe_reset_keepalive(), DistributedRunner.maybe_start_keepalive(), DistributedRunner.run(), DistributedRunner.start(), DistributedRunner.stop(), DistributedRunner.stream(), DistributedRunner.terminate(), DistributedRunner.zmq_socket
DistributedWorker, DistributedWorker.add_streaming_error(), DistributedWorker.add_streaming_result(), DistributedWorker.device, DistributedWorker.initialize(), DistributedWorker.loop, DistributedWorker.queue, DistributedWorker.rank_print(), DistributedWorker.run_in_worker(), DistributedWorker.running, DistributedWorker.setup(), DistributedWorker.shutdown(), DistributedWorker.submit(), DistributedWorker.teardown(), DistributedWorker.thread
- Module contents
DistributedRunner, DistributedRunner.close_zmq_socket(), DistributedRunner.context, DistributedRunner.ensure_alive(), DistributedRunner.gather_errors(), DistributedRunner.get_zmq_socket(), DistributedRunner.invoke(), DistributedRunner.is_alive(), DistributedRunner.keepalive(), DistributedRunner.keepalive_timer, DistributedRunner.maybe_cancel_keepalive(), DistributedRunner.maybe_reset_keepalive(), DistributedRunner.maybe_start_keepalive(), DistributedRunner.run(), DistributedRunner.start(), DistributedRunner.stop(), DistributedRunner.stream(), DistributedRunner.terminate(), DistributedRunner.zmq_socket
DistributedWorker, DistributedWorker.add_streaming_error(), DistributedWorker.add_streaming_result(), DistributedWorker.device, DistributedWorker.initialize(), DistributedWorker.loop, DistributedWorker.queue, DistributedWorker.rank_print(), DistributedWorker.run_in_worker(), DistributedWorker.running, DistributedWorker.setup(), DistributedWorker.shutdown(), DistributedWorker.submit(), DistributedWorker.teardown(), DistributedWorker.thread
- fal.exceptions package
- Submodules
- fal.exceptions.auth module
- fal.exceptions.gpu module
- Module contents
- fal.logging package
- fal.toolkit package
- Subpackages
- Submodules
- fal.toolkit.compilation module
- fal.toolkit.exceptions module
- fal.toolkit.kv module
- fal.toolkit.pydantic module
- fal.toolkit.types module
- Module contents
Audio, AudioField(), CompressedFile, DownloadError, FalBaseModel, FalTookitException, Field(), File, FileField(), FileUploadException, Hidden(), Image, ImageField(), KVStore, KVStoreException, Video, VideoField(), clone_repository(), download_file(), download_model_weights(), get_gpu_type(), get_image_size(), load_inductor_cache(), sync_inductor_cache(), synchronized_inductor_cache()
Submodules¶
fal.app module¶
- class fal.app.App(*, _allow_init=False)¶
Bases:
BaseServable
Create a fal serverless application.
Subclass this to define your application with custom setup, endpoints, and configuration. The App class handles model loading, request routing, and lifecycle management.
Example
>>> class TextToImage(fal.App, machine_type="GPU"):
...     requirements = ["diffusers", "torch"]
...
...     def setup(self):
...         self.pipe = StableDiffusionPipeline.from_pretrained(
...             "runwayml/stable-diffusion-v1-5"
...         )
...
...     @fal.endpoint("/")
...     def generate(self, prompt: str) -> dict:
...         image = self.pipe(prompt).images[0]
...         return {"url": fal.toolkit.upload_image(image)}
- requirements¶
Pip packages to install in the environment. Supports standard pip syntax including version specifiers. Use a list of strings for a single install step, or a list of lists to install in multiple steps. Example: ["numpy==1.24.0", "torch>=2.0.0"] or [["setuptools", "wheel"], ["numpy==1.24.0"]]
- local_python_modules¶
List of local Python module names to include in the deployment. Use for custom code not available on PyPI. Example: ["my_utils", "models"]
- machine_type¶
Compute instance type for your application. CPU options: 'XS', 'S' (default), 'M', 'L'. GPU options: 'GPU-A6000', 'GPU-A100', 'GPU-H100', 'GPU-H200', 'GPU-B200'. Use a string for a single type, or a list to define fallback types (tried in order until one is available). Example: "GPU-A100" or ["GPU-H100", "GPU-A100"]
- num_gpus¶
Number of GPUs to allocate. Only applies to GPU machine types.
- regions¶
Allowed regions for deployment. None means any region. Example: ["us-east", "eu-west"]
- host_kwargs¶
Advanced configuration dictionary passed to the host. For internal use. Prefer using class attributes instead.
- app_name¶
Custom name for the application. Defaults to class name.
- app_auth¶
Authentication mode. Options: 'private' (API key required), 'public' (no auth), 'shared' (shareable link).
- app_files¶
List of files/directories to include in deployment. Example: ["./models", "./config.yaml"]
- app_files_ignore¶
Regex patterns to exclude from deployment. Default excludes .pyc, __pycache__, .git, .DS_Store.
- app_files_context_dir¶
Base directory for resolving app_files paths. Defaults to the directory containing the app file.
- request_timeout¶
Maximum seconds for a single request. None for default.
- startup_timeout¶
Maximum seconds for app startup/setup. None for default.
- min_concurrency¶
Minimum warm instances to keep running. Set to 1+ to avoid cold starts. Default is 0 (scale to zero).
- max_concurrency¶
Maximum instances to scale up to.
- concurrency_buffer¶
Additional instances to keep warm above current load.
- concurrency_buffer_perc¶
Percentage buffer of instances above current load.
- scaling_delay¶
Seconds to wait for a request to be picked up by a runner before triggering a scale up. Useful for apps with slow startup times.
- max_multiplexing¶
Maximum concurrent requests per instance.
- kind¶
Deployment kind. For internal use.
- image¶
Custom container image for the application. Use ContainerImage to specify a Dockerfile.
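The two "scalar or list" shapes described above (`requirements` as a flat list or a list of lists, `machine_type` as a string or a list of fallbacks) can be normalized as in the sketch below. The helper names `normalize_requirements` and `normalize_machine_types` are hypothetical and not part of fal; the sketch only illustrates the accepted shapes, not how fal processes them internally.

```python
from typing import List, Union


def normalize_requirements(
    requirements: Union[List[str], List[List[str]]]
) -> List[List[str]]:
    """Return install steps as a list of lists (hypothetical helper)."""
    if not requirements:
        return []
    # A list of lists already describes multiple install steps.
    if all(isinstance(step, list) for step in requirements):
        return [list(step) for step in requirements]
    # A flat list of strings is treated as a single install step.
    if all(isinstance(item, str) for item in requirements):
        return [list(requirements)]
    raise TypeError("requirements must be list[str] or list[list[str]]")


def normalize_machine_types(machine_type: Union[str, List[str]]) -> List[str]:
    """Return machine types in fallback order (hypothetical helper)."""
    return [machine_type] if isinstance(machine_type, str) else list(machine_type)


steps = normalize_requirements([["setuptools", "wheel"], ["numpy==1.24.0"]])
fallbacks = normalize_machine_types(["GPU-H100", "GPU-A100"])
```

Here `steps` keeps the two install steps separate, and `fallbacks` preserves the order in which machine types would be tried.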
-
app_auth:
ClassVar[Optional[Literal['public','private','shared']]] = None¶
-
app_files:
ClassVar[list[str]] = []¶
-
app_files_context_dir:
ClassVar[Optional[str]] = None¶
-
app_files_ignore:
ClassVar[list[str]] = ['\\.pyc$', '__pycache__/', '\\.git/', '\\.DS_Store$']¶
-
app_name:
ClassVar[Optional[str]] = None¶
- collect_routes()¶
- Return type:
dict[RouteSignature,Callable[...,Any]]
-
concurrency_buffer:
ClassVar[int|None] = None¶
-
concurrency_buffer_perc:
ClassVar[int|None] = None¶
- property current_request: RequestContext | None¶
- classmethod get_endpoints()¶
- Return type:
list[str]
- classmethod get_health_check_config()¶
- Return type:
Optional[ApplicationHealthCheckConfig]
- handle_exit()¶
Handle exit signal.
- health()¶
-
host_kwargs:
ClassVar[dict[str,Any]] = {'_scheduler': 'nomad', '_scheduler_options': {'storage_region': 'us-east'}, 'keep_alive': 60, 'resolver': 'uv'}¶
-
image:
ClassVar[Optional[ContainerImage]] = None¶
-
isolate_channel:
Channel|None= None¶
-
kind:
ClassVar[Optional[str]] = None¶
- lifespan(app)¶
-
local_file_path:
ClassVar[Optional[str]] = None¶
-
local_python_modules:
ClassVar[list[str]] = []¶
-
machine_type:
ClassVar[str|list[str]] = 'S'¶
-
max_concurrency:
ClassVar[int|None] = None¶
-
max_multiplexing:
ClassVar[int|None] = None¶
-
min_concurrency:
ClassVar[int|None] = None¶
-
num_gpus:
ClassVar[int|None] = None¶
- provide_hints()¶
Provide hints for routing the application.
- Return type:
list[str]
-
regions:
ClassVar[Optional[list[str]]] = None¶
-
request_timeout:
ClassVar[int|None] = None¶
-
requirements:
ClassVar[list[str] |list[list[str]]] = []¶
- classmethod run_local(*args, **kwargs)¶
-
scaling_delay:
ClassVar[int|None] = None¶
- setup()¶
Setup the application before serving.
-
skip_retry_conditions:
ClassVar[Optional[list[Literal['timeout','server_error','connection_error']]]] = None¶
- classmethod spawn()¶
- Return type:
-
startup_timeout:
ClassVar[int|None] = None¶
- teardown()¶
Teardown the application after serving.
-
termination_grace_period_seconds:
ClassVar[int|None] = None¶
- class fal.app.AppClient(cls, url, timeout=None)¶
Bases:
object
- classmethod connect(cls, app_cls, *, health_request_timeout=30, startup_timeout=60, health_check_interval=0.5)¶
- health()¶
- exception fal.app.AppClientError(message, status_code, headers=<factory>)¶
Bases:
FalServerlessException
-
headers:
dict[str,str]¶
-
message:
str¶
-
status_code:
int¶
- class fal.app.AppSpawnInfo(info)¶
Bases:
object
- property application¶
- property future¶
- property logs¶
- property stream¶
- property url¶
- wait(*, health_request_timeout=30, startup_timeout=60, health_check_interval=0.5, headers=None)¶
- Return type:
None
- class fal.app.EndpointClient(url, endpoint, signature, timeout=None, headers=None)¶
Bases:
object
- class fal.app.RequestContext(request_id, endpoint, lifecycle_preference, headers)¶
Bases:
object
-
endpoint:
str|None¶
-
headers:
Header¶
-
lifecycle_preference:
dict[str,str] |None¶
-
request_id:
str|None¶
- fal.app.endpoint(path, *, is_websocket=False, health_check=None)¶
Designate the decorated function as an application endpoint.
- Return type:
Callable[[TypeVar(EndpointT, bound=Callable[...,Any])],TypeVar(EndpointT, bound=Callable[...,Any])]
- async fal.app.open_isolate_channel(address)¶
- Return type:
Channel|None
- fal.app.wrap_app(cls, **kwargs)¶
- Return type:
fal.apps module¶
- class fal.apps.Completed(logs)¶
Bases:
_Status
Indicates the request has been completed successfully and the result is ready to be retrieved.
-
logs:
list[dict[str,Any]] |None¶
- class fal.apps.InProgress(logs)¶
Bases:
_Status
Indicates the request is now being actively processed, and provides runtime logs for the inference task.
-
logs:
list[dict[str,Any]] |None¶
- class fal.apps.Queued(position)¶
Bases:
_Status
Indicates the request is still in the queue, and provides the position in the queue for ETA calculation.
-
position:
int¶
- class fal.apps.RequestHandle(app_id, request_id, _client=<factory>, _creds=<factory>)¶
Bases:
object
A handle to an async inference request.
-
app_id:
str¶
- cancel()¶
Cancel an async inference request.
- Return type:
None
- fetch_raw_response()¶
- Return type:
Response
- fetch_result()¶
Retrieve the result of an async inference request. Raises an exception if the request has not completed yet.
- Return type:
dict[str,Any]
- get()¶
Retrieve the result of an async inference request, polling the status of the request until it is completed.
- Return type:
dict[str,Any]
- iter_events(*, logs=False, _RequestHandle__poll_delay=0.2)¶
Yield all events regarding the given task until it is completed.
- Return type:
Iterator[_Status]
-
request_id:
str¶
- status(*, logs=False)¶
Check the status of an async inference request.
- Return type:
_Status
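The polling behaviour described for `get()` and `iter_events()` can be pictured with the following minimal sketch. The `Queued`, `InProgress`, and `Completed` classes here are local stand-ins that mirror the names in this module, and `fetch_status` is a hypothetical callable replacing the remote status check. Whether the real iterator also yields the final `Completed` status is an assumption of this sketch.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Union


@dataclass
class Queued:
    position: int


@dataclass
class InProgress:
    logs: Optional[List[dict]] = None


@dataclass
class Completed:
    logs: Optional[List[dict]] = None


Status = Union[Queued, InProgress, Completed]


def iter_events(
    fetch_status: Callable[[], Status], poll_delay: float = 0.2
) -> Iterator[Status]:
    """Yield each polled status until a Completed status is seen."""
    while True:
        status = fetch_status()
        yield status
        if isinstance(status, Completed):
            return
        time.sleep(poll_delay)


# A scripted status source standing in for the remote queue.
scripted = iter([Queued(position=2), Queued(position=0), InProgress(), Completed()])
events = list(iter_events(lambda: next(scripted), poll_delay=0.0))
```

A `get()`-style helper would simply drain this iterator and then fetch the result once the last status is `Completed`.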
- fal.apps.run(app_id, arguments, *, path='')¶
Run an inference task on a Fal app and return the result.
- Return type:
dict[str,Any]
- fal.apps.stream(app_id, arguments, *, path='')¶
Stream an inference task on a Fal app.
- Return type:
Iterator[str|bytes]
- fal.apps.submit(app_id, arguments, *, path='')¶
Submit an async inference task to the app. Returns a request handle which can be used to check the status of the request and retrieve the result.
- Return type:
- fal.apps.ws(app_id, *, path='')¶
Connect to an HTTP endpoint using the WebSocket protocol. This is an internal, experimental API; use it at your own risk.
- Return type:
Iterator[_WSConnection]
fal.compat module¶
- async fal.compat.run_in_thread(func, *args, **kwargs)¶
Run sync code on a worker thread with Python 3.8+ support.
- Return type:
Any
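On Python 3.8, `asyncio.to_thread` is not available, so one common way to get this behaviour is `loop.run_in_executor`. The sketch below shows that approach under the assumption that the default thread-pool executor is acceptable; it is not necessarily how fal implements `run_in_thread`, and `blocking_add` is a made-up example function.

```python
import asyncio
import functools
import threading


async def run_in_thread(func, *args, **kwargs):
    """Run a blocking callable on the default thread-pool executor."""
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional args, so bind kwargs with partial.
    return await loop.run_in_executor(None, functools.partial(func, *args, **kwargs))


def blocking_add(a, b=0):
    # Return the sum and whether the call ran on the main thread.
    return a + b, threading.current_thread() is threading.main_thread()


result, ran_on_main_thread = asyncio.run(run_in_thread(blocking_add, 1, b=2))
```

The event loop stays free to serve other coroutines while `blocking_add` runs on a worker thread.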
fal.config module¶
- class fal.config.Config(*, validate_profile=False, profile=None)¶
Bases:
object
- DEFAULT_CONFIG_PATH = '~/.fal/config.toml'¶
- delete_profile(profile)¶
- Return type:
None
- get(key)¶
- Return type:
Optional[str]
- get_internal(key)¶
- Return type:
Optional[str]
- property profile: str | None¶
- profiles()¶
- Return type:
List[str]
- save()¶
- Return type:
None
- set(key, value)¶
- Return type:
None
- set_internal(key, value)¶
- Return type:
None
- unset(key)¶
- Return type:
None
- unset_internal(key)¶
- Return type:
None
fal.container module¶
- class fal.container.ContainerImage(dockerfile_str, build_args=<factory>, registries=<factory>, builder=None, compression='gzip', force_compression=False, secrets=<factory>, context_dir=PosixPath('/home/runner/work/fal/fal/projects/fal'), dockerignore=None, dockerignore_path=None)¶
Bases:
object
ContainerImage represents a Docker image that can be built from a Dockerfile.
- add_dockerignore(patterns=None, path=None)¶
Add or update dockerignore patterns.
Sets the internal dockerignore patterns using gitignore-style matching. You can provide either a list of patterns or a path to a .dockerignore file.
- Parameters:
patterns (Optional[List[str]]) – List of gitignore-style patterns
path (Optional[PathLike]) – Path to a .dockerignore file
- Raises:
ValueError – If both patterns and path are provided, or neither
- Return type:
None
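The "exactly one of `patterns` or `path`" rule stated under Raises can be checked as in this small sketch. The function name `resolve_dockerignore_source` is hypothetical and only demonstrates the validation; it does not reproduce the rest of `add_dockerignore`.

```python
from os import PathLike
from typing import List, Optional, Union


def resolve_dockerignore_source(
    patterns: Optional[List[str]] = None,
    path: Optional[Union[str, PathLike]] = None,
) -> str:
    """Return which source was given, enforcing exactly one of the two."""
    # Both None or both set are equally invalid.
    if (patterns is None) == (path is None):
        raise ValueError("Provide exactly one of 'patterns' or 'path'")
    return "patterns" if patterns is not None else "path"
```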
-
build_args:
Dict[str,str]¶
-
builder:
Optional[Literal['depot','service','worker']] = None¶
-
compression:
str= 'gzip'¶
-
context_dir:
PathLike= PosixPath('/home/runner/work/fal/fal/projects/fal')¶
-
dockerfile_str:
str¶
-
dockerignore:
Optional[List[str]] = None¶
-
dockerignore_path:
Optional[PathLike] = None¶
-
force_compression:
bool= False¶
- classmethod from_dockerfile(path, **kwargs)¶
- Return type:
- classmethod from_dockerfile_str(text, **kwargs)¶
- Return type:
- get_copy_add_sources()¶
Get list of src paths/patterns from COPY/ADD commands. This method only parses the Dockerfile - it doesn’t access the filesystem.
- Return type:
List[str]
- Returns:
List of src paths (e.g., ["src/", "requirements.txt", "*.py"]) that can be passed to FileSync.sync_files(). Returns an empty list if no COPY/ADD commands are found.
-
registries:
Dict[str,Dict[str,str]]¶
-
secrets:
Dict[str,str]¶
- to_dict()¶
- Return type:
dict
- class fal.container.DockerfileParser(content)¶
Bases:
object
-
content:
str¶
-
normalized_content:
str¶
- parse_copy_add_sources()¶
Parse COPY and ADD commands to extract source paths.
- Return type:
List[str]
- Returns:
List of source paths/patterns referenced in COPY/ADD commands.
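To illustrate what extracting COPY/ADD sources from Dockerfile text involves, here is a simplified sketch. It is not the library's parser: it ignores the JSON-array form (`COPY ["a", "b"]`) and heredocs, and it assumes that `COPY --from=<stage>` should be skipped because it reads from another build stage rather than the local context.

```python
import shlex
from typing import List


def parse_copy_add_sources(dockerfile: str) -> List[str]:
    """Collect source paths from COPY/ADD instructions (simplified sketch)."""
    # Join backslash line continuations so each instruction is one line.
    normalized = dockerfile.replace("\\\n", " ")
    sources: List[str] = []
    for line in normalized.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) < 2 or parts[0].upper() not in ("COPY", "ADD"):
            continue
        tokens = shlex.split(parts[1])
        flags = [t for t in tokens if t.startswith("--")]
        args = [t for t in tokens if not t.startswith("--")]
        # Assumption: --from=<stage> copies from another stage, not the context.
        if any(f.startswith("--from=") for f in flags):
            continue
        # The last argument is the destination; the rest are sources.
        sources.extend(args[:-1])
    return sources


example = (
    "FROM python:3.11\n"
    "COPY src/ /app/src/\n"
    "ADD requirements.txt *.py \\\n"
    "    /app/\n"
    "COPY --from=builder /out /out\n"
)
found = parse_copy_add_sources(example)
```

Like the documented method, this only inspects the Dockerfile text and never touches the filesystem.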
- class fal.container.DockerignoreHandler(context_dir=None, dockerignore=None, dockerignore_path=None)¶
Bases:
object
-
context_dir:
Optional[PathLike] = None¶
-
dockerignore:
Optional[List[str]] = None¶
-
dockerignore_path:
Optional[PathLike] = None¶
- get_patterns()¶
Get list of ignore patterns.
Priority (highest to lowest):
1. Explicit dockerignore list
2. Explicit path to the .dockerignore file
3. .dockerignore file in the context directory
4. Default ignore patterns
- Return type:
List[str]
- Returns:
List of ignore patterns
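The priority order listed for `get_patterns()` can be sketched as a chain of fallbacks. The `DEFAULT_PATTERNS` values and the `read_patterns` helper below are hypothetical placeholders; the library's real defaults and file parsing may differ.

```python
from pathlib import Path
from typing import List, Optional

# Hypothetical defaults for illustration only.
DEFAULT_PATTERNS = [".git", "__pycache__"]


def read_patterns(path: Path) -> List[str]:
    """Read non-empty, non-comment lines from a .dockerignore-style file."""
    lines = (ln.strip() for ln in path.read_text().splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]


def get_patterns(
    context_dir: Optional[Path] = None,
    dockerignore: Optional[List[str]] = None,
    dockerignore_path: Optional[Path] = None,
) -> List[str]:
    if dockerignore is not None:  # 1. explicit list
        return list(dockerignore)
    if dockerignore_path is not None:  # 2. explicit file path
        return read_patterns(Path(dockerignore_path))
    if context_dir is not None:  # 3. file in the context directory
        candidate = Path(context_dir) / ".dockerignore"
        if candidate.is_file():
            return read_patterns(candidate)
    return list(DEFAULT_PATTERNS)  # 4. defaults
```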
- get_regex_patterns()¶
- Return type:
List[str]
fal.file_sync module¶
- class fal.file_sync.FileMetadata(size, mtime, mode, hash, relative_path, absolute_path)¶
Bases:
object
-
absolute_path:
str¶
- classmethod from_path(file_path, *, relative, absolute)¶
- Return type:
-
hash:
str¶
-
mode:
int¶
-
mtime:
float¶
-
relative_path:
str¶
-
size:
int¶
- to_dict()¶
- Return type:
Dict[str,str]
- class fal.file_sync.FileSync(local_file_path)¶
Bases:
object
- check_hashes_on_server(hashes)¶
- Return type:
List[str]
- close()¶
- collect_files(paths, files_context_dir=None)¶
- sync_files(paths, chunk_size=10485760, max_concurrency_uploads=10, files_ignore=[], files_context_dir=None)¶
- Return type:
Tuple[List[FileMetadata],List[AppFileUploadException]]
- upload_file_multipart(file_path, metadata, chunk_size=10485760)¶
- Return type:
str
- class fal.file_sync.FileSyncOptions(files_list, files_ignore, files_context_dir)¶
Bases:
object
-
files_context_dir:
Optional[str]¶
-
files_ignore:
List[Pattern]¶
-
files_list:
List[str]¶
- classmethod from_options(options)¶
- Return type:
- fal.file_sync.compute_hash(file_path, mode)¶
- Return type:
str
- fal.file_sync.normalize_path(path_str, base_path_str, files_context_dir=None)¶
- Return type:
Tuple[str,str]
- fal.file_sync.print_path_tree(file_paths)¶
- fal.file_sync.sanitize_relative_path(rel_path, original_path)¶
- Return type:
str
fal.files module¶
- class fal.files.FalFileSystem(*, host=None, team=None, profile=None, **kwargs)¶
Bases:
AbstractFileSystem
- get_file(rpath, lpath, **kwargs)¶
Copy single remote file to local
- info(path, **kwargs)¶
Give details of entry at path
Returns a single dictionary, with exactly the same information as ls would with detail=True.
The default implementation calls ls and could be overridden by a shortcut. kwargs are passed on to ls().
Some file systems might not be able to measure the file’s size, in which case, the returned dict will include 'size': None.
- Returns:
dict with keys name (full path in the FS), size (in bytes), type (file, directory, or something else), and other FS-specific keys.
- ls(path, detail=True, **kwargs)¶
List objects at path.
This should include subdirectories and files at that location. The difference between a file and a directory must be clear when details are requested.
The specific keys, or perhaps a FileInfo class, or similar, is TBD, but must be consistent across implementations. Must include:
- full path to the entry (without protocol)
- size of the entry, in bytes. If the value cannot be determined, will be None.
- type of entry, "file", "directory" or other
Additional information may be present, appropriate to the file-system, e.g., generation, checksum, etc.
May use refresh=True|False to allow use of self._ls_from_cache to check for a saved listing and avoid calling the backend. This would be common where listing may be expensive.
- Parameters:
path (str)
detail (bool) – if True, gives a list of dictionaries, where each is the same as the result of info(path). If False, gives a list of paths (str).
kwargs – may have additional backend-specific options, such as version information
- Returns:
List of strings if detail is False, or list of directory information dicts if detail is True.
- mv(path1, path2, recursive=False, maxdepth=None, **kwargs)¶
Move file(s) from one location to another
- put_file(lpath, rpath, mode='overwrite', **kwargs)¶
Copy single file to remote
- put_file_from_url(url, rpath, mode='overwrite', **kwargs)¶
- rename(path, destination, **kwargs)¶
Alias of AbstractFileSystem.mv.
- rm(path, **kwargs)¶
Delete files.
- Parameters:
path (str or list of str) – File(s) to delete.
recursive (bool) – If file(s) are directories, recursively delete contents and then also remove the directory
maxdepth (int or None) – Depth to pass to walk for finding files to delete, if recursive. If None, there will be no limit and infinite recursion may be possible.
fal.flags module¶
- fal.flags.bool_envvar(name)¶
fal.project module¶
- fal.project.find_project_root(srcs)¶
Return a directory containing .git or pyproject.toml.
That directory will be a common parent of all files and directories passed in srcs.
If no directory in the tree contains a marker that would specify it’s the project root, the root of the file system is returned.
Returns a two-tuple with the first element as the project root path and the second element as a string describing the method by which the project root was discovered.
- Return type:
Tuple[Path,str]
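The search described for `find_project_root` can be approximated as follows. This is a sketch under stated assumptions rather than the library's code: the reason strings (`".git directory"`, `"pyproject.toml"`, `"file system root"`) are made up for illustration, and `.git` is checked before `pyproject.toml` in each directory.

```python
import os
from pathlib import Path
from typing import Sequence, Tuple


def find_project_root(srcs: Sequence[str]) -> Tuple[Path, str]:
    """Walk up from the common parent of srcs looking for a root marker."""
    paths = [Path(s).resolve() for s in srcs] or [Path.cwd().resolve()]
    common = Path(os.path.commonpath([str(p) for p in paths]))
    start = common if common.is_dir() else common.parent
    for directory in (start, *start.parents):
        if (directory / ".git").exists():
            return directory, ".git directory"
        if (directory / "pyproject.toml").is_file():
            return directory, "pyproject.toml"
    # No marker found anywhere: fall back to the file system root.
    return Path(start.anchor), "file system root"
```

Because the walk starts at the common parent of all inputs, the returned directory is always an ancestor of every path in `srcs`.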
- fal.project.find_pyproject_toml(path_search_start=None)¶
Find the absolute filepath to a pyproject.toml if it exists
- Return type:
Optional[str]
- fal.project.parse_pyproject_toml(path_config)¶
Parse a pyproject toml file, pulling out relevant parts for fal.
If parsing fails, will raise a tomli.TOMLDecodeError.
- Return type:
Dict[str,Any]
fal.realtime module¶
- fal.realtime.msgpack_decode_message(message)¶
- Return type:
Any
- fal.realtime.msgpack_encode_message(message)¶
- Return type:
bytes
- fal.realtime.realtime(path, *, buffering=None, session_timeout=None, input_modal=<object object>, output_modal=<object object>, max_batch_size=1, content_type='application/msgpack', encode_message=None, decode_message=None)¶
Designate the decorated function as a realtime application endpoint.
- Return type:
Callable[[TypeVar(EndpointT, bound=Callable[...,Any])],TypeVar(EndpointT, bound=Callable[...,Any])]
fal.ref module¶
- fal.ref.set_current_app(app)¶
fal.sdk module¶
- class fal.sdk.AliasInfo(alias, revision, auth_mode, keep_alive, max_concurrency, max_multiplexing, active_runners, min_concurrency, concurrency_buffer, concurrency_buffer_perc, scaling_delay, machine_types, request_timeout, startup_timeout, valid_regions, environment_name=None)¶
Bases:
object
-
active_runners:
int¶
-
alias:
str¶
-
auth_mode:
str¶
-
concurrency_buffer:
int¶
-
concurrency_buffer_perc:
int¶
-
environment_name:
str|None= None¶
-
keep_alive:
int¶
-
machine_types:
list[str]¶
-
max_concurrency:
int¶
-
max_multiplexing:
int¶
-
min_concurrency:
int¶
-
request_timeout:
int¶
-
revision:
str¶
-
scaling_delay:
int¶
-
startup_timeout:
int¶
-
valid_regions:
list[str]¶
- class fal.sdk.ApplicationHealthCheckConfig(path, start_period_seconds, timeout_seconds, failure_threshold, call_regularly)¶
Bases:
object
-
call_regularly:
Optional[bool]¶
-
failure_threshold:
Optional[int]¶
-
path:
str¶
-
start_period_seconds:
Optional[int]¶
-
timeout_seconds:
Optional[int]¶
- class fal.sdk.ApplicationInfo(application_id, keep_alive, max_concurrency, max_multiplexing, active_runners, min_concurrency, concurrency_buffer, concurrency_buffer_perc, scaling_delay, machine_types, request_timeout, startup_timeout, valid_regions, created_at, environment_name=None)¶
Bases:
object
-
active_runners:
int¶
-
application_id:
str¶
-
concurrency_buffer:
int¶
-
concurrency_buffer_perc:
int¶
-
created_at:
datetime¶
-
environment_name:
str|None= None¶
-
keep_alive:
int¶
-
machine_types:
list[str]¶
-
max_concurrency:
int¶
-
max_multiplexing:
int¶
-
min_concurrency:
int¶
-
request_timeout:
int¶
-
scaling_delay:
int¶
-
startup_timeout:
int¶
-
valid_regions:
list[str]¶
- class fal.sdk.AuthenticatedCredentials(user=<factory>, team=None)¶
Bases:
Credentials
-
team:
str|None= None¶
- to_grpc()¶
- Return type:
ChannelCredentials
- to_headers()¶
- Return type:
dict[str,str]
-
user:
UserAccess¶
- class fal.sdk.Credentials¶
Bases:
object
-
server_credentials:
ServerCredentials= <fal.sdk.RemoteCredentials object>¶
- to_grpc()¶
- Return type:
ChannelCredentials
- to_headers()¶
- Return type:
dict[str,str]
- class fal.sdk.DeploymentStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
Enum
- RECREATE = 'recreate'¶
- ROLLING = 'rolling'¶
- static from_proto(proto)¶
- Return type:
- to_proto()¶
- Return type:
int
- class fal.sdk.EnvironmentInfo(name, description, is_default, created_at)¶
Bases:
object
-
created_at:
datetime¶
-
description:
str|None¶
-
is_default:
bool¶
-
name:
str¶
- class fal.sdk.FalServerlessClient(hostname, credentials=<factory>)¶
Bases:
object
- connect()¶
- Return type:
-
credentials:
Credentials¶
-
hostname:
str¶
- class fal.sdk.FalServerlessConnection(hostname, credentials, _stack=<factory>, _stub=None)¶
Bases:
object
- close()¶
- create_environment(name, description=None)¶
- Return type:
- create_user_key(scope, alias)¶
- Return type:
tuple[str,str]
-
credentials:
Credentials¶
- define_environment(kind, force=False, **options)¶
- Return type:
EnvironmentDefinition
- delete_alias(alias, *, environment_name=None)¶
- Return type:
str|None
- delete_application(application_id)¶
- Return type:
None
- delete_environment(name)¶
- Return type:
None
- delete_secret(name, *, environment_name=None)¶
- Return type:
None
-
hostname:
str¶
- kill_runner(runner_id)¶
- Return type:
None
- list_alias_runners(alias, *, list_pending=True, start_time=None, environment_name=None)¶
- Return type:
list[RunnerInfo]
- list_applications(application_name=None, *, environment_name=None)¶
- Return type:
list[ApplicationInfo]
- list_environments()¶
- Return type:
list[EnvironmentInfo]
- list_runners(start_time=None)¶
- Return type:
list[RunnerInfo]
- list_secrets(*, environment_name=None)¶
- Return type:
list[ServerlessSecret]
- list_user_keys()¶
- Return type:
list[UserKeyInfo]
- register(function, environments, application_name=None, auth_mode=None, *, source_code=None, health_check_config=None, serialization_method='cloudpickle', machine_requirements=None, metadata=None, deployment_strategy, scale=True, private_logs=False, files=None, skip_retry_conditions=None, environment_name=None, termination_grace_period_seconds=None)¶
- Return type:
Iterator[RegisterApplicationResult]
- revoke_user_key(key_id)¶
- Return type:
None
- rollout_application(application_name, force=False, *, environment_name=None)¶
- Return type:
None
- run(function, environments, *, serialization_method='cloudpickle', machine_requirements=None, setup_function=None, files=None, application_name=None, auth_mode=None, environment_name=None)¶
- Return type:
Iterator[HostedRunResult[TypeVar(ResultT)]]
- scale(application_name, max_concurrency=None)¶
- Return type:
None
- set_secret(name, value, *, environment_name=None)¶
- Return type:
None
- stop_runner(runner_id, replace_first=False)¶
- Return type:
None
- property stub: IsolateControllerStub¶
- update_application(application_name, keep_alive=None, max_multiplexing=None, max_concurrency=None, min_concurrency=None, concurrency_buffer=None, concurrency_buffer_perc=None, scaling_delay=None, request_timeout=None, startup_timeout=None, valid_regions=None, machine_types=None, *, environment_name=None)¶
- Return type:
- class fal.sdk.FalServerlessKeyCredentials(key_id, key_secret)¶
Bases:
Credentials
-
key_id:
str¶
-
key_secret:
str¶
- to_grpc()¶
- Return type:
ChannelCredentials
- to_headers()¶
- Return type:
dict[str,str]
- class fal.sdk.HealthCheck(*, start_period_seconds=None, timeout_seconds=None, failure_threshold=None, call_regularly=None)¶
Bases:
object
-
call_regularly:
Optional[bool] = None¶
-
failure_threshold:
Optional[int] = None¶
-
start_period_seconds:
Optional[int] = None¶
-
timeout_seconds:
Optional[int] = None¶
- class fal.sdk.HostedRunResult(run_id, status, logs=<factory>, result=None, stream=None, service_urls=None)¶
Bases:
Generic[ResultT]
-
logs:
list[Log]¶
-
result:
Optional[TypeVar(ResultT)] = None¶
-
run_id:
str¶
-
service_urls:
ServiceURLs|None= None¶
-
status:
HostedRunStatus¶
-
stream:
Any= None¶
- class fal.sdk.HostedRunState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
Enum
- INTERNAL_FAILURE = 2¶
- IN_PROGRESS = 0¶
- SUCCESS = 1¶
- class fal.sdk.HostedRunStatus(state)¶
Bases:
object
-
state:
HostedRunState¶
- class fal.sdk.KeyScope(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
Enum
- ADMIN = 'ADMIN'¶
- API = 'API'¶
- class fal.sdk.LocalCredentials¶
Bases:
ServerCredentials
- to_grpc()¶
- Return type:
ChannelCredentials
- class fal.sdk.MachineRequirements(machine_types, num_gpus=None, keep_alive=10, base_image=None, exposed_port=None, scheduler=None, scheduler_options=None, max_concurrency=None, max_multiplexing=None, min_concurrency=None, concurrency_buffer=None, concurrency_buffer_perc=None, scaling_delay=None, request_timeout=None, startup_timeout=None, valid_regions=None)¶
Bases:
object
-
base_image:
str|None= None¶
-
concurrency_buffer:
int|None= None¶
-
concurrency_buffer_perc:
int|None= None¶
-
exposed_port:
int|None= None¶
-
keep_alive:
int= 10¶
-
machine_types:
list[str]¶
-
max_concurrency:
int|None= None¶
-
max_multiplexing:
int|None= None¶
-
min_concurrency:
int|None= None¶
-
num_gpus:
int|None= None¶
-
request_timeout:
int|None= None¶
-
scaling_delay:
int|None= None¶
-
scheduler:
str|None= None¶
-
scheduler_options:
dict[str,Any] |None= None¶
-
startup_timeout:
int|None= None¶
-
valid_regions:
list[str] |None= None¶
- class fal.sdk.RegisterApplicationResult(result, logs=<factory>, service_urls=None)¶
Bases:
object
-
logs:
list[Log]¶
-
result:
RegisterApplicationResultType|None¶
-
service_urls:
ServiceURLs|None= None¶
- class fal.sdk.RemoteCredentials¶
Bases:
ServerCredentials
- to_grpc()¶
- Return type:
ChannelCredentials
- class fal.sdk.ReplaceState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
Enum
- DID_REPLACE = 'DID_REPLACE'¶
- NO_REPLACE = 'NO_REPLACE'¶
- WILL_REPLACE = 'WILL_REPLACE'¶
- class fal.sdk.RunnerInfo(runner_id, in_flight_requests, expiration_countdown, uptime, external_metadata, revision, alias, state, machine_type, replacement=ReplaceState.NO_REPLACE)¶
Bases:
object
-
alias:
str¶
-
expiration_countdown:
Optional[int]¶
-
external_metadata:
dict[str,Any]¶
-
in_flight_requests:
int¶
-
machine_type:
str¶
-
replacement:
ReplaceState= 'NO_REPLACE'¶
-
revision:
str¶
-
runner_id:
str¶
-
state:
RunnerState¶
-
uptime:
timedelta¶
- class fal.sdk.RunnerState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
Enum
- DEAD = 'DEAD'¶
- DOCKER_PULL = 'DOCKER_PULL'¶
- DRAINING = 'DRAINING'¶
- FAILURE_DELAY = 'FAILURE_DELAY'¶
- IDLE = 'IDLE'¶
- PENDING = 'PENDING'¶
- RUNNING = 'RUNNING'¶
- SETUP = 'SETUP'¶
- TERMINATED = 'TERMINATED'¶
- TERMINATING = 'TERMINATING'¶
- class fal.sdk.ServerCredentials¶
Bases:
object
- property base_options: dict[str, str | int]¶
- to_grpc()¶
- Return type:
ChannelCredentials
- class fal.sdk.ServerlessSecret(name, created_at, environment_name=None)¶
Bases:
object
-
created_at:
datetime¶
-
environment_name:
str|None= None¶
-
name:
str¶
- class fal.sdk.ServiceURLs(playground, run, queue, ws, log)¶
Bases:
object-
log:
str¶
-
playground:
str¶
-
queue:
str¶
-
run:
str¶
-
ws:
str¶
- class fal.sdk.UserKeyInfo(key_id, created_at, scope, alias)¶
Bases:
object
-
alias:
str¶
-
created_at:
datetime¶
-
key_id:
str¶
- class fal.sdk.WorkerStatus(worker_id, start_time, end_time, duration, user_id, machine_type)¶
Bases:
object
-
duration:
timedelta¶
-
end_time:
datetime¶
-
machine_type:
str¶
-
start_time:
datetime¶
-
user_id:
str¶
-
worker_id:
str¶
- fal.sdk.construct_alias(base_name, environment_name=None)¶
Construct the full alias with environment suffix.
Examples:
- ("my-app", None) → "my-app"
- ("my-app", "main") → "my-app"
- ("my-app", "staging") → "my-app--staging"
- Return type:
str
- fal.sdk.deconstruct_alias(full_alias, environment_name=None)¶
Extract base name from full alias for display.
Examples:
- ("my-app--staging", "staging") → "my-app"
- ("my-app", "main") → "my-app"
- ("my-app", None) → "my-app"
- Return type:
str
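The examples for construct_alias and deconstruct_alias suggest a simple suffix rule. The following is a minimal sketch of that rule, not the library's implementation. It assumes the separator is a double hyphen (`--`) and that both None and "main" mean "no suffix". The names `construct_alias_sketch` and `deconstruct_alias_sketch` are hypothetical.

```python
from typing import Optional

SEPARATOR = "--"  # assumption: separator between base name and environment
DEFAULT_ENVIRONMENT = "main"  # assumption: behaves like "no environment"


def construct_alias_sketch(base_name: str, environment_name: Optional[str] = None) -> str:
    """Append an environment suffix unless the environment is None or 'main'."""
    if environment_name is None or environment_name == DEFAULT_ENVIRONMENT:
        return base_name
    return f"{base_name}{SEPARATOR}{environment_name}"


def deconstruct_alias_sketch(full_alias: str, environment_name: Optional[str] = None) -> str:
    """Strip the environment suffix, if present, to recover the base name."""
    if environment_name is None or environment_name == DEFAULT_ENVIRONMENT:
        return full_alias
    suffix = f"{SEPARATOR}{environment_name}"
    if full_alias.endswith(suffix):
        return full_alias[: -len(suffix)]
    return full_alias
```

Under these assumptions the two functions are inverses for any given environment name, which matches the documented examples.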
- fal.sdk.get_agent_credentials(original_credentials)¶
If running inside a fal Serverless box, use the preconfigured credentials instead of the user-provided ones.
- Return type:
- fal.sdk.get_credentials(team=None, key=None, profile=None)¶
- Return type:
- fal.sdk.get_default_server_credentials()¶
- Return type:
fal.sync module¶
- fal.sync.sync_dir(local_dir, remote_dir, force_upload=False)¶
- Return type:
str
fal.upload module¶
- class fal.upload.AppFileMultipartUpload(client, file_hash, metadata, chunk_size=10485760, max_concurrency=10)¶
Bases:
BaseMultipartUpload- property cancel_url: str | None¶
- property complete_url: str¶
- get_complete_payload(parts)¶
- Return type:
dict
- get_initiate_payload()¶
- Return type:
Optional[dict]
- property initiate_url: str¶
- property part_url: str¶
- class fal.upload.BaseMultipartUpload(client, chunk_size=10485760, max_concurrency=10)¶
Bases:
object- cancel()¶
- Return type:
None
- property cancel_url: str | None¶
- complete()¶
- Return type:
str
- property complete_url: str¶
- get_complete_payload(parts)¶
- Return type:
dict
- get_initiate_payload()¶
- Return type:
Optional[dict]
- initiate()¶
- Return type:
str
- property initiate_url: str¶
- property part_url: str¶
- upload_file(file_path, on_part_complete=None)¶
- Return type:
str
- property upload_id: str¶
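The multipart upload classes default to chunk_size=10485760 (10 MiB) and max_concurrency=10. The sketch below illustrates what those two numbers typically mean in a multipart upload: how a file is split into parts and how the parts can be processed by a bounded pool of workers. It performs no network calls and is not the library's implementation. `plan_parts`, `upload_parts`, the `upload_one` callback, and part numbering starting at 1 are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

CHUNK_SIZE = 10_485_760  # 10 MiB, the documented default chunk_size
MAX_CONCURRENCY = 10  # the documented default max_concurrency

Part = Tuple[int, int, int]  # (part_number, offset, length)


def plan_parts(file_size: int, chunk_size: int = CHUNK_SIZE) -> List[Part]:
    """Return (part_number, offset, length) triples covering the whole file."""
    parts: List[Part] = []
    offset = 0
    part_number = 1  # assumption: parts are numbered from 1
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


def upload_parts(
    parts: List[Part],
    upload_one: Callable[[Part], dict],
    max_concurrency: int = MAX_CONCURRENCY,
) -> List[dict]:
    """Run the hypothetical upload_one callback on each part with bounded concurrency."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map() preserves input order, so results line up with part numbers.
        return list(pool.map(upload_one, parts))
```

A 25 MiB file, for example, would be planned as two full 10 MiB parts followed by one 5 MiB part.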
fal.utils module¶
- class fal.utils.LoadedFunction(function, endpoints, app_name, app_auth, source_code, class_name=None)¶
Bases:
object-
app_auth:
Optional[Literal['public','private','shared']]¶
-
app_name:
str|None¶
-
class_name:
str|None= None¶
-
endpoints:
list[str]¶
-
function:
IsolatedFunction¶
-
source_code:
str|None¶
- fal.utils.load_function_from(host, file_path, function_name=None, *, force_env_build=False, options=None, app_name=None, app_auth=None, limit_max_requests=None)¶
- Return type:
fal.workflows module¶
- class fal.workflows.AttributeLeaf(leaf, attribute)¶
Bases:
Leaf-
attribute:
str¶
- property referee: ReferenceLeaf¶
- class fal.workflows.Context(vars)¶
Bases:
object
- class fal.workflows.Display(id, depends, fields)¶
Bases:
Node- to_json()¶
- Return type:
dict[str,Any]
- class fal.workflows.IndexLeaf(leaf, index)¶
Bases:
Leaf-
index:
int¶
- property referee: ReferenceLeaf¶
- class fal.workflows.Leaf¶
Bases:
object- property referee: ReferenceLeaf¶
- exception fal.workflows.MisconfiguredGraphError¶
Bases:
WorkflowSyntaxError
- class fal.workflows.Node(id, depends)¶
Bases:
object-
depends:
set[str]¶
-
id:
str¶
- to_json()¶
- Return type:
dict[str,Any]
- class fal.workflows.ReferenceLeaf(id)¶
Bases:
Leaf-
id:
str¶
- property referee: ReferenceLeaf¶
- class fal.workflows.Run(id, depends, app, input)¶
Bases:
Node-
app:
str¶
- to_json()¶
- Return type:
dict[str,Any]
- class fal.workflows.Workflow(name, input_schema, output_schema, nodes=<factory>, output=None, _app_counter=<factory>)¶
Bases:
object- display(*fields)¶
- Return type:
None
- property input: ReferenceLeaf¶
-
input_schema:
Dict[str,Any]¶
-
name:
str¶
-
output:
dict[str,Any] |None= None¶
-
output_schema:
Dict[str,Any]¶
- publish(title, *, is_public=True)¶
- run(app, input)¶
- Return type:
- set_output(output)¶
- Return type:
None
- exception fal.workflows.WorkflowSyntaxError¶
Bases:
FalServerlessException
- fal.workflows.depends(data)¶
- Return type:
set[str]
- fal.workflows.export_workflow_json(data)¶
- Return type:
Union[Dict[str,Any],List[Any],str,int,float,bool,None,Leaf]
- fal.workflows.import_workflow_json(data)¶
- Return type:
Union[Dict[str,Any],List[Any],str,int,float,bool,None,Leaf]
- fal.workflows.iter_leaves(data)¶
- Return type:
Iterator[Union[Dict[str,Any],List[Any],str,int,float,bool,None,Leaf]]
- fal.workflows.main()¶
- Return type:
None
Module contents¶
- class fal.App(*, _allow_init=False)¶
Bases:
BaseServable
Create a fal serverless application.
Subclass this to define your application with custom setup, endpoints, and configuration. The App class handles model loading, request routing, and lifecycle management.
Example
>>> class TextToImage(fal.App, machine_type="GPU"):
...     requirements = ["diffusers", "torch"]
...
...     def setup(self):
...         self.pipe = StableDiffusionPipeline.from_pretrained(
...             "runwayml/stable-diffusion-v1-5"
...         )
...
...     @fal.endpoint("/")
...     def generate(self, prompt: str) -> dict:
...         image = self.pipe(prompt).images[0]
...         return {"url": fal.toolkit.upload_image(image)}
- requirements¶
Pip packages to install in the environment. Supports standard pip syntax including version specifiers. Use a list of strings for a single install step, or a list of lists to install in multiple steps. Example: ["numpy==1.24.0", "torch>=2.0.0"] or [["setuptools", "wheel"], ["numpy==1.24.0"]]
- local_python_modules¶
List of local Python module names to include in the deployment. Use for custom code not available on PyPI. Example: ["my_utils", "models"]
- machine_type¶
Compute instance type for your application. CPU options: 'XS', 'S' (default), 'M', 'L'. GPU options: 'GPU-A6000', 'GPU-A100', 'GPU-H100', 'GPU-H200', 'GPU-B200'. Use a string for a single type, or a list to define fallback types (tried in order until one is available). Example: "GPU-A100" or ["GPU-H100", "GPU-A100"]
- num_gpus¶
Number of GPUs to allocate. Only applies to GPU machine types.
- regions¶
Allowed regions for deployment. None means any region. Example: ["us-east", "eu-west"]
- host_kwargs¶
Advanced configuration dictionary passed to the host. For internal use. Prefer using class attributes instead.
- app_name¶
Custom name for the application. Defaults to class name.
- app_auth¶
Authentication mode. Options: 'private' (API key required), 'public' (no auth), 'shared' (shareable link).
- app_files¶
List of files/directories to include in deployment. Example: ["./models", "./config.yaml"]
- app_files_ignore¶
Regex patterns to exclude from deployment. Default excludes .pyc, __pycache__, .git, .DS_Store.
- app_files_context_dir¶
Base directory for resolving app_files paths. Defaults to the directory containing the app file.
- request_timeout¶
Maximum seconds for a single request. None for default.
- startup_timeout¶
Maximum seconds for app startup/setup. None for default.
- min_concurrency¶
Minimum number of warm instances to keep running. Set to 1 or more to avoid cold starts. Default is 0 (scale to zero).
- max_concurrency¶
Maximum instances to scale up to.
- concurrency_buffer¶
Additional instances to keep warm above current load.
- concurrency_buffer_perc¶
Percentage buffer of instances above current load.
- scaling_delay¶
Seconds to wait for a request to be picked up by a runner before triggering a scale up. Useful for apps with slow startup times.
- max_multiplexing¶
Maximum concurrent requests per instance.
- kind¶
Deployment kind. For internal use.
- image¶
Custom container image for the application. Use ContainerImage to specify a Dockerfile.
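The requirements and machine_type attributes each accept two shapes: a flat list or a list of lists for requirements, and a string or a list of strings for machine_type. The sketch below shows one way such values could be normalized into a uniform form. It only illustrates the documented shapes and is not how fal processes them internally. `normalize_requirements` and `normalize_machine_types` are hypothetical helpers.

```python
from typing import List, Union


def normalize_requirements(
    requirements: Union[List[str], List[List[str]]],
) -> List[List[str]]:
    """Return a list of install steps, each a list of pip requirement strings."""
    if not requirements:
        return []
    if all(isinstance(item, str) for item in requirements):
        # A flat list of strings is a single install step.
        return [list(requirements)]
    if all(isinstance(item, list) for item in requirements):
        # A list of lists is already a sequence of install steps.
        return [list(step) for step in requirements]
    raise TypeError("requirements must be a list of strings or a list of lists")


def normalize_machine_types(machine_type: Union[str, List[str]]) -> List[str]:
    """Return machine types as an ordered fallback list."""
    if isinstance(machine_type, str):
        return [machine_type]
    return list(machine_type)
```

With this normalization, ["numpy==1.24.0", "torch>=2.0.0"] becomes one step, while [["setuptools", "wheel"], ["numpy==1.24.0"]] stays as two steps in the given order.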
-
app_auth:
ClassVar[Optional[Literal['public','private','shared']]] = None¶
-
app_files:
ClassVar[list[str]] = []¶
-
app_files_context_dir:
ClassVar[Optional[str]] = None¶
-
app_files_ignore:
ClassVar[list[str]] = ['\\.pyc$', '__pycache__/', '\\.git/', '\\.DS_Store$']¶
-
app_name:
ClassVar[Optional[str]] = None¶
- collect_routes()¶
- Return type:
dict[RouteSignature,Callable[...,Any]]
-
concurrency_buffer:
ClassVar[int|None] = None¶
-
concurrency_buffer_perc:
ClassVar[int|None] = None¶
- property current_request: RequestContext | None¶
- classmethod get_endpoints()¶
- Return type:
list[str]
- classmethod get_health_check_config()¶
- Return type:
Optional[ApplicationHealthCheckConfig]
- handle_exit()¶
Handle exit signal.
- health()¶
-
host_kwargs:
ClassVar[dict[str,Any]] = {'_scheduler': 'nomad', '_scheduler_options': {'storage_region': 'us-east'}, 'keep_alive': 60, 'resolver': 'uv'}¶
-
image:
ClassVar[Optional[ContainerImage]] = None¶
-
isolate_channel:
Channel|None= None¶
-
kind:
ClassVar[Optional[str]] = None¶
- lifespan(app)¶
-
local_file_path:
ClassVar[Optional[str]] = None¶
-
local_python_modules:
ClassVar[list[str]] = []¶
-
machine_type:
ClassVar[str|list[str]] = 'S'¶
-
max_concurrency:
ClassVar[int|None] = None¶
-
max_multiplexing:
ClassVar[int|None] = None¶
-
min_concurrency:
ClassVar[int|None] = None¶
-
num_gpus:
ClassVar[int|None] = None¶
- provide_hints()¶
Provide hints for routing the application.
- Return type:
list[str]
-
regions:
ClassVar[Optional[list[str]]] = None¶
-
request_timeout:
ClassVar[int|None] = None¶
-
requirements:
ClassVar[list[str] |list[list[str]]] = []¶
- classmethod run_local(*args, **kwargs)¶
-
scaling_delay:
ClassVar[int|None] = None¶
- setup()¶
Set up the application before serving.
-
skip_retry_conditions:
ClassVar[Optional[list[Literal['timeout','server_error','connection_error']]]] = None¶
- classmethod spawn()¶
- Return type:
-
startup_timeout:
ClassVar[int|None] = None¶
- teardown()¶
Tear down the application after serving.
-
termination_grace_period_seconds:
ClassVar[int|None] = None¶
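The app_files_ignore attribute of fal.App holds regular expressions rather than glob patterns. The sketch below applies the documented default patterns to a list of relative paths to show which files they would exclude. It assumes the patterns are matched with `re.search` against forward-slash paths, which the source does not state. `filter_ignored` is a hypothetical helper.

```python
import re
from typing import Iterable, List, Sequence

# The documented default value of app_files_ignore.
DEFAULT_IGNORE = (r"\.pyc$", r"__pycache__/", r"\.git/", r"\.DS_Store$")


def filter_ignored(
    paths: Iterable[str], patterns: Sequence[str] = DEFAULT_IGNORE
) -> List[str]:
    """Keep only the paths that match none of the ignore patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    # Assumption: a path is ignored if any pattern is found anywhere in it.
    return [
        path
        for path in paths
        if not any(regex.search(path) for regex in compiled)
    ]
```

Because these are regexes, the dot must be escaped, and a pattern such as `\.git/` excludes the contents of `.git/` but not a sibling directory like `.github/`.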
- class fal.ContainerImage(dockerfile_str, build_args=<factory>, registries=<factory>, builder=None, compression='gzip', force_compression=False, secrets=<factory>, context_dir=PosixPath('/home/runner/work/fal/fal/projects/fal'), dockerignore=None, dockerignore_path=None)¶
Bases:
object
ContainerImage represents a Docker image that can be built from a Dockerfile.
- add_dockerignore(patterns=None, path=None)¶
Add or update dockerignore patterns.
Sets the internal dockerignore patterns using gitignore-style matching. You can provide either a list of patterns or a path to a .dockerignore file.
- Parameters:
patterns (Optional[List[str]]) – List of gitignore-style patterns
path (Optional[PathLike]) – Path to a .dockerignore file
- Raises:
ValueError – If both patterns and path are provided, or if neither is provided
- Return type:
None
-
build_args:
Dict[str,str]¶
-
builder:
Optional[Literal['depot','service','worker']] = None¶
-
compression:
str= 'gzip'¶
-
context_dir:
PathLike= PosixPath('/home/runner/work/fal/fal/projects/fal')¶
-
dockerfile_str:
str¶
-
dockerignore:
Optional[List[str]] = None¶
-
dockerignore_path:
Optional[PathLike] = None¶
-
force_compression:
bool= False¶
- classmethod from_dockerfile(path, **kwargs)¶
- Return type:
- classmethod from_dockerfile_str(text, **kwargs)¶
- Return type:
- get_copy_add_sources()¶
Get list of src paths/patterns from COPY/ADD commands. This method only parses the Dockerfile - it doesn’t access the filesystem.
- Return type:
List[str]
- Returns:
List of src paths (e.g., ["src/", "requirements.txt", "*.py"]) that can be passed to FileSync.sync_files(). Returns an empty list if no COPY/ADD commands are found.
-
registries:
Dict[str,Dict[str,str]]¶
-
secrets:
Dict[str,str]¶
- to_dict()¶
- Return type:
dict
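Two ContainerImage behaviours described above can be illustrated without the library: the "exactly one of patterns or path" rule of add_dockerignore, and the text-only extraction of COPY/ADD sources done by get_copy_add_sources. The sketch below is a simplified stand-in, not fal's parser. It ignores line continuations and the JSON array form of COPY, and skipping `--from=` copies is an assumption. `check_dockerignore_args` and `copy_add_sources` are hypothetical names.

```python
import shlex
from typing import List, Optional


def check_dockerignore_args(
    patterns: Optional[List[str]] = None, path: Optional[str] = None
) -> str:
    """Enforce the documented rule: exactly one of patterns or path."""
    if (patterns is None) == (path is None):
        raise ValueError("Provide exactly one of 'patterns' or 'path'")
    return "patterns" if patterns is not None else "path"


def copy_add_sources(dockerfile_text: str) -> List[str]:
    """Collect source arguments of COPY/ADD lines without touching the filesystem."""
    sources: List[str] = []
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Check the instruction first so unrelated lines are never tokenized.
        if line.split(None, 1)[0].upper() not in ("COPY", "ADD"):
            continue
        tokens = shlex.split(line)
        flags = [token for token in tokens[1:] if token.startswith("--")]
        if any(flag.startswith("--from=") for flag in flags):
            # Assumption: copies from another build stage are not local sources.
            continue
        args = [token for token in tokens[1:] if not token.startswith("--")]
        # The last argument is the destination; everything before it is a source.
        sources.extend(args[:-1])
    return sources
```

Glob patterns such as `*.py` are returned verbatim, consistent with the note that the method only parses the Dockerfile.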
- class fal.FalServerlessKeyCredentials(key_id, key_secret)¶
Bases:
Credentials-
key_id:
str¶
-
key_secret:
str¶
- to_grpc()¶
- Return type:
ChannelCredentials
- to_headers()¶
- Return type:
dict[str,str]
- class fal.HealthCheck(*, start_period_seconds=None, timeout_seconds=None, failure_threshold=None, call_regularly=None)¶
Bases:
object-
call_regularly:
Optional[bool] = None¶
-
failure_threshold:
Optional[int] = None¶
-
start_period_seconds:
Optional[int] = None¶
-
timeout_seconds:
Optional[int] = None¶
- fal.cached(func)¶
Cache the result of the given function in-memory.
- Return type:
Callable[[ParamSpec(ArgsT)],TypeVar(ReturnT, covariant=True)]
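In-memory caching of a function result, as described for fal.cached, is commonly implemented as a decorator that stores results in a dictionary. The sketch below shows that general technique only. Whether fal.cached keys the cache on arguments, and how long entries live, is not stated in the source, so both are assumptions here. `cached_sketch` and `load_model` are hypothetical names.

```python
import functools
from typing import Any, Callable, Dict, List, Tuple


def cached_sketch(func: Callable[..., Any]) -> Callable[..., Any]:
    """Memoize func in a per-function dictionary held in process memory."""
    cache: Dict[Tuple[Any, ...], Any] = {}

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        # Assumption: results are keyed on the (hashable) call arguments.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


calls: List[str] = []  # records how often the wrapped body actually runs


@cached_sketch
def load_model(name: str) -> dict:
    """Stand-in for an expensive load; the body runs once per distinct name."""
    calls.append(name)
    return {"name": name}
```

Calling `load_model` twice with the same name returns the same object and runs the body only once, which is the property that makes this pattern useful for expensive setup such as loading model weights.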
- fal.endpoint(path, *, is_websocket=False, health_check=None)¶
Designate the decorated function as an application endpoint.
- Return type:
Callable[[TypeVar(EndpointT, bound=Callable[...,Any])],TypeVar(EndpointT, bound=Callable[...,Any])]
- fal.function(kind='virtualenv', *, host=None, local_python_modules=None, **config)¶
- fal.realtime(path, *, buffering=None, session_timeout=None, input_modal=<object object>, output_modal=<object object>, max_batch_size=1, content_type='application/msgpack', encode_message=None, decode_message=None)¶
Designate the decorated function as a realtime application endpoint.
- Return type:
Callable[[TypeVar(EndpointT, bound=Callable[...,Any])],TypeVar(EndpointT, bound=Callable[...,Any])]
- fal.sync_dir(local_dir, remote_dir, force_upload=False)¶
- Return type:
str