HPC System Admin
Company: NR Consulting LLC
Location: Austin
Posted on: January 26, 2023
Job Description:
Job Title: HPC System Admin
Work Location: Austin, TX
Position Type: Contract with possible extension
Duration: 12+ Months
Project Details:
Responsible for architecting and implementing Linux High
Performance Computing (HPC) clusters. Performs system architecture
duties on a Linux HPC cluster, including cluster management,
virtualization, cluster usage monitoring, health monitoring, job
scheduling, application integration/installation (open source as
well as vendor supported), and application performance. Improves
cluster performance through kernel changes, firmware updates,
library stack changes, and application container management such
as Docker.
Mandatory Skills, Technologies, Frameworks, and Methodologies:
Knowledge of Linux and UNIX operating systems, including scripting
and programming proficiencies.
Experience with cloud bursting technologies.
Knowledge of cloud services like AWS SCOCA, AWS ParallelCluster, and
Azure CycleCloud.
Knowledge of HPC tools and storage: AWS Elastic Fabric Adapter,
Azure ANF, Apache Spark or Apache Ignite, Lustre, BeeGFS.
Demonstrated experience in programming system maintenance tasks in
C, Java, Perl, batch/shell, or another general-purpose programming
language.
Knowledge of NUMA and understanding of NUMA-related APIs.
Ability to perform complex performance analysis, including system
processes, I/O subsystems, networks, and other related
components.
Must have experience with multi-threading and parallel processing
tools and environments.
Must have experience as a systems administrator. Must have advanced
ability to analyze complex IT systems.
Experience with high-performance servers and associated
high-performance networks.
Experience installing and maintaining clustered environments,
including automated installation methods.
Knowledge of common server hardware architectures, including servers
(CPU, bus, memory), SANs, disk arrays, and network hardware.
Understanding of the Red Hat Linux operating system, including
processes, files, memory management, and I/O systems; networking
services and protocols (e.g., TCP/IP, SSL, FTP, Telnet, LDAP).
Understanding of IP networking, basic routing, TCP ports and
network services, including SSH, LDAP, SFTP and HTTP(S).
Ability to design, promote, and implement change control and
configuration management, patch management, high availability
systems, structured design and support methodologies.
Must be organized with a strong ability to deliver tasks on time,
manage multiple efforts and be able to work with minimal
supervision.
Demonstrated ability to proactively learn, adapt to and use new
hardware/software technologies.
Good-to-Have Skills, Technologies, Frameworks, and Methodologies:
Performs system administration duties on a Linux HPC cluster,
including cluster management, virtualization, cluster usage
monitoring, health monitoring, job scheduling, and application
integration/installation.
Responsible for system implementation/integration and systems
performance analysis.
Manages hardware and software applications in the production
environment provided to HPC users.
Installs software and updates.
Coordinates with vendors to resolve hardware and software problems
in the HPC cluster.
Facilitates the acquisition of hardware and software products and
services for the HPC cluster.
Knowledge of LSF or other open-source job schedulers.
Compiles, configures, and integrates open-source applications into
the HPC environment.
Able to learn and use internal software systems.
Monitors the availability of patches and updates, evaluates their
importance to the environment, and schedules installations
accordingly.
Keeps abreast of the latest HPC hardware and software technology,
evaluating technologies as needed.
Designs, implements, and administers high performance computing
clusters, performing proofs of concept such as software containers
(e.g., Docker).
Interacts effectively with a broad range of colleagues such as
Applied Materials researchers and other IT staff.
Other duties may be assigned.