This role supports the First-to-Know capability of the Technical Operations Center and requires coordination and multi-tasking skills to ensure maximum service availability of our systems.
TOC Engineers monitor the performance and capacity of enterprise-wide systems using a variety of tools to identify hardware, software, and environmental alerts.
Provide eyes-on-glass monitoring of Client systems and investigate, verify, report, communicate, and escalate any issues.
Author communications that are sent out to the CF user community and CF leadership about outages, upgrades, IT challenges, etc.
Candidates will also work with the technical teams to write outage summaries and lessons-learned reports so that senior management understands the impact on the CF community and the corrections needed to avoid future occurrences.
You will have the opportunity to work across a wide variety of IT issues, learn about many different technologies, and grow into more advanced roles.
This team provides 24x7x365 support to the CF user community.
Lead kickoff meetings with various teams to determine what is needed to perform monitoring.
Create accurate process documents that can be followed to perform various tasks within the TOC.
Manage a ticket queue in ServiceNow and resolve tickets.
This role will require shift work. The Client Technology Operation Center covers a 24/7 operation, and members are asked to be flexible in providing coverage outside of their normal shift hours when the need arises. The position is full-time and can be performed fully remotely.
Responsibilities include:
Provide eyes-on-glass monitoring using various monitoring tools such as Dynatrace, Splunk, SCOM, ITM, and SolarWinds
Investigate and verify alerts and reported issues
Escalate issues to the Tier 2 operations team when necessary
Access devices and analyze graphing
Review device logs, documentation, and analysis
Perform real time monitoring of vital systems
Provide general event management and communication management support for the TOC
Support a 24x7 system monitoring service to proactively identify and assess problems before the customer reports them
Support response time by ensuring that system information, contact information, and processes are in place to coordinate the necessary IT response to system problems
Rely on your teammates and be an active collaborator and participant within the group
Provide event management and support to service owners and IT managers
Author reports, prepare data for status/findings presentations, prepare flowcharts and draft process documents for team activities.
Communicate an honest interpretation of data to all stakeholders; support and facilitate open communication between all stakeholders.
Required Qualifications:
AA/AS and 4+ years of experience, or an equivalent combination such as a bachelor's degree and 4+ years of experience, or no degree and at least 5 years in NOC/TOC engineer, network engineering, or IT analyst type roles.
4+ years of experience with networking and/or application monitoring tools
Experience with enterprise dashboards and monitoring tools
Dynatrace monitoring experience preferred
Able to accurately interpret various metrics from monitoring tools.
Strong written and verbal communication
Strong analytical skills and the ability to collate and interpret data from various sources.
Strong communicator with a natural aptitude for dealing with people
Able to represent Client and the TOC in the most professional manner
Knowledge of and experience with system and network infrastructures such as databases, batch job processing, LAN and WAN network technologies, server virtualization, enterprise storage area network (SAN) and backup, and enterprise performance and fault monitoring tools. Basic sysadmin skills.
Ability to assess and prioritize faults and respond or escalate accordingly.
Desired Qualifications:
Familiarity with developing and implementing mature operations center processes, tooling and automation
Working knowledge of IT Infrastructure Operations (such as systems and network administration, security, various tools, etc.)
Experience with some or all of the following monitoring and reporting tools: Splunk, Dynatrace, SCOM, ITM, SolarWinds, ServiceNow
Experience working in Incident Response as an Incident Resolver
Familiarity with ITIL processes like change management