File Inspection (Experimental)¶

by Josh Dillon, Aaron Parsons, and Tyler Cox, last updated September 25, 2022

This notebook infers as much information about the array as it can from a single file, pushing calibration and RFI mitigation as far as possible.

Here's a set of links to skip to particular figures and tables:

  • Figure 1: Plot of autocorrelations with classifications
  • Figure 2: Summary of antenna classifications prior to calibration
  • Figure 3: Redundant calibration of a single baseline group
  • Figure 4: $\chi^2$ per antenna across the array
  • Figure 5: Summary of antenna classifications after redundant calibration
  • Table 1: Complete summary of per antenna classifications
In [1]:
import time
tstart = time.time()
In [2]:
import numpy as np
from scipy import constants
import os
import copy
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
pd.set_option('display.max_rows', 1000)
from uvtools.plot import plot_antpos, plot_antclass
from hera_qm import ant_metrics, ant_class, xrfi
from hera_cal import io, utils, redcal, apply_cal, datacontainer, abscal
from IPython.display import display, HTML
import linsolve
display(HTML("<style>.container { width:100% !important; }</style>"))
_ = np.seterr(all='ignore')  # get rid of red warnings
%config InlineBackend.figure_format = 'retina'

Parse inputs¶

To use this notebook interactively, you will have to provide a sum filename path if none exists as an environment variable. All other parameters have reasonable default values.

In [3]:
# get file names
SUM_FILE = os.environ.get("SUM_FILE", None)
# SUM_FILE = '/mnt/sn1/zen.2459797.30001.sum.uvh5'  # If sum_file is not defined in the environment variables, define it here.

if SUM_FILE is None:
    SUM_FILE = '/Users/jsdillon/Desktop/zen.2459842.35234.sum.uvh5'  # local fallback when the environment variable is unset

DIFF_FILE = SUM_FILE.replace('sum', 'diff')
PLOT = os.environ.get("PLOT", "TRUE").upper() == "TRUE"
if PLOT:
    %matplotlib inline
print(f"SUM_FILE = '{SUM_FILE}'")
SUM_FILE = '/Users/jsdillon/Desktop/zen.2459842.35234.sum.uvh5'
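
The cell above derives the diff file name with `SUM_FILE.replace('sum', 'diff')`, which rewrites every occurrence of `sum` in the path, including any in directory names. A minimal sketch of a stricter alternative follows, assuming the `zen.<JD>.sum.uvh5` naming seen above. `diff_path_from_sum` is a hypothetical helper and the example path is made up; neither comes from hera_cal.

```python
import os

def diff_path_from_sum(sum_path):
    """Swap the '.sum.' tag for '.diff.' in the file name only (hypothetical helper)."""
    directory, name = os.path.split(sum_path)
    if '.sum.' not in name:
        raise ValueError(f"expected '.sum.' in file name, got {name!r}")
    # Replace only the last '.sum.' in the base name; directories are left alone.
    head, _, tail = name.rpartition('.sum.')
    return os.path.join(directory, f'{head}.diff.{tail}')

# Made-up path whose directory also contains the substring 'sum'.
print(diff_path_from_sum('/data/summary/zen.2459842.35234.sum.uvh5'))
```

With a plain `str.replace`, the `summary` directory in this example would become `diffmary`; restricting the substitution to the base name avoids that.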

Parse settings¶

Load settings relating to the operation of the notebook, then print what was loaded (or default).

In [4]:
# parse plotting settings
PLOT = os.environ.get("PLOT", "TRUE").upper() == "TRUE"  # compare explicitly: any non-empty string is truthy
if PLOT:
    %matplotlib inline

# parse omnical settings
OC_MAX_DIMS = int(os.environ.get("OC_MAX_DIMS", 4))
OC_MIN_DIM_SIZE = int(os.environ.get("OC_MIN_DIM_SIZE", 8))
OC_SKIP_OUTRIGGERS = os.environ.get("OC_SKIP_OUTRIGGERS", "TRUE").upper() == "TRUE"
OC_MAXITER = int(os.environ.get("OC_MAXITER", 50))
OC_MAX_RERUN = int(os.environ.get("OC_MAX_RERUN", 4))

# print settings
for setting in ['PLOT', 'OC_MAX_DIMS', 'OC_MIN_DIM_SIZE', 'OC_SKIP_OUTRIGGERS', 'OC_MAXITER', 'OC_MAX_RERUN']:
    print(f'{setting} = {eval(setting)}')
PLOT = True
OC_MAX_DIMS = 4
OC_MIN_DIM_SIZE = 8
OC_SKIP_OUTRIGGERS = True
OC_MAXITER = 100
OC_MAX_RERUN = 4
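
Environment variables always arrive as strings, so a value such as `'false'` is truthy if tested directly. The sketch below shows one way to parse boolean and integer settings with defaults. `env_flag`, `env_int`, and the `DEMO_PLOT` variable are hypothetical names used only for illustration.

```python
import os

def env_flag(name, default=True):
    """Read a boolean environment variable; only 'TRUE' (any case) counts as True."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().upper() == 'TRUE'

def env_int(name, default):
    """Read an integer environment variable, falling back to a default."""
    return int(os.environ.get(name, default))

# DEMO_PLOT is a made-up variable used only for this illustration.
os.environ['DEMO_PLOT'] = 'false'
print(bool(os.environ['DEMO_PLOT']))  # truthiness of the raw string
print(env_flag('DEMO_PLOT'))          # explicit comparison against 'TRUE'
```

The first line prints `True` and the second prints `False`, which is why the settings in this notebook are compared against `"TRUE"` rather than used as-is.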

Parse bounds¶

Load settings related to classifying antennas as good, suspect, or bad, then print what was loaded (or default).

In [5]:
# ant_metrics bounds for low correlation / dead antennas
am_corr_bad = (0, float(os.environ.get("AM_CORR_BAD", 0.3)))
am_corr_suspect = (float(os.environ.get("AM_CORR_BAD", 0.3)), float(os.environ.get("AM_CORR_SUSPECT", 0.5)))

# ant_metrics bounds for cross-polarized antennas
am_xpol_bad = (-1, float(os.environ.get("AM_XPOL_BAD", -0.1)))
am_xpol_suspect = (float(os.environ.get("AM_XPOL_BAD", -0.1)), float(os.environ.get("AM_XPOL_SUSPECT", 0)))

# bounds on zeros in spectra
good_zeros_per_eo_spectrum = (0, int(os.environ.get("MAX_ZEROS_PER_EO_SPEC_GOOD", 2)))
suspect_zeros_per_eo_spectrum = (0, int(os.environ.get("MAX_ZEROS_PER_EO_SPEC_SUSPECT", 8)))

# bounds on autocorrelation power
auto_power_good = (float(os.environ.get("AUTO_POWER_GOOD_LOW", 5)), float(os.environ.get("AUTO_POWER_GOOD_HIGH", 30)))
auto_power_suspect = (float(os.environ.get("AUTO_POWER_SUSPECT_LOW", 1)), float(os.environ.get("AUTO_POWER_SUSPECT_HIGH", 80)))

# bounds on autocorrelation slope
auto_slope_good = (float(os.environ.get("AUTO_SLOPE_GOOD_LOW", -0.2)), float(os.environ.get("AUTO_SLOPE_GOOD_HIGH", 0.2)))
auto_slope_suspect = (float(os.environ.get("AUTO_SLOPE_SUSPECT_LOW", -0.4)), float(os.environ.get("AUTO_SLOPE_SUSPECT_HIGH", 0.4)))

# bounds on autocorrelation RFI
auto_rfi_good = (0, float(os.environ.get("AUTO_RFI_GOOD", 0.05)))
auto_rfi_suspect = (0, float(os.environ.get("AUTO_RFI_SUSPECT", 0.1)))

# bounds on chi^2 per antenna in omnical
oc_cspa_good = (0, float(os.environ.get("OC_CSPA_GOOD", 3)))
oc_cspa_suspect = (float(os.environ.get("OC_CSPA_GOOD", 3)), float(os.environ.get("OC_CSPA_SUSPECT", 4)))

# print bounds
for bound in ['am_corr_bad', 'am_corr_suspect', 'am_xpol_bad', 'am_xpol_suspect', 
              'good_zeros_per_eo_spectrum', 'suspect_zeros_per_eo_spectrum',
              'auto_power_good', 'auto_power_suspect', 'auto_slope_good', 'auto_slope_suspect',
              'auto_rfi_good', 'auto_rfi_suspect', 'oc_cspa_good', 'oc_cspa_suspect']:
    print(f'{bound} = {eval(bound)}')
am_corr_bad = (0, 0.2)
am_corr_suspect = (0.2, 0.4)
am_xpol_bad = (-1, -0.1)
am_xpol_suspect = (-0.1, 0.0)
good_zeros_per_eo_spectrum = (0, 2)
suspect_zeros_per_eo_spectrum = (0, 8)
auto_power_good = (5.0, 30.0)
auto_power_suspect = (1.0, 80.0)
auto_slope_good = (-0.2, 0.2)
auto_slope_suspect = (-0.4, 0.4)
auto_rfi_good = (0, 0.05)
auto_rfi_suspect = (0, 0.1)
oc_cspa_good = (0, 3.0)
oc_cspa_suspect = (3.0, 4.0)
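
Each bound above is a `(low, high)` interval attached to a quality class. The sketch below illustrates how such intervals can turn a metric into a class, assuming the classes are tried in the order given and the first interval containing the value wins. That ordering and the inclusive edges are assumptions made for illustration, and `classify_by_bounds` is a hypothetical function, not the hera_qm implementation.

```python
def classify_by_bounds(value, **bounds):
    """Return the first class whose (low, high) interval contains value.

    Classes are tried in keyword order; None means no interval matched.
    """
    for quality, (low, high) in bounds.items():
        if low <= value <= high:
            return quality
    return None

# Intervals taken from the printed bounds above.
am_corr_bad, am_corr_suspect = (0, 0.2), (0.2, 0.4)
for corr in (0.1, 0.3, 0.8):
    print(corr, classify_by_bounds(corr, bad=am_corr_bad, suspect=am_corr_suspect, good=(0, 1)))
```

Under these assumptions, a broad interval such as `good=(0, 1)` acts as a fallback: it only applies to values that the narrower `bad` and `suspect` intervals listed before it did not claim.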

Load sum and diff data¶

In [6]:
hd = io.HERADataFastReader(SUM_FILE)
data, _, _ = hd.read(read_flags=False, read_nsamples=False)
hd_diff = io.HERADataFastReader(DIFF_FILE)
diff_data, _, _ = hd_diff.read(read_flags=False, read_nsamples=False)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 hd = io.HERADataFastReader(SUM_FILE)
      2 data, _, _ = hd.read(read_flags=False, read_nsamples=False)
      3 hd_diff = io.HERADataFastReader(DIFF_FILE)

File ~/mambaforge/envs/RTP/lib/python3.10/site-packages/hera_cal/io.py:1242, in HERADataFastReader.__init__(self, input_data, read_metadata, check, skip_lsts)
   1232 '''Instantiates a HERADataFastReader object. Only supports reading uvh5 files, not writing them.
   1233 Does not support BDA and only supports patial i/o along baselines and polarization axes.
   1234 
   (...)
   1239     skip_lsts (bool, False): save time by not computing LSTs from JDs
   1240 '''
   1241 # parse input_data as filepath(s)
-> 1242 self.filepaths = _parse_input_files(input_data, name='input_data')
   1244 # load metadata only
   1245 rv = {'info': {}}

File ~/mambaforge/envs/RTP/lib/python3.10/site-packages/hera_cal/io.py:53, in _parse_input_files(inputs, name)
     51 for f in filepaths:
     52     if not os.path.exists(f):
---> 53         raise IOError('Cannot find file ' + f)
     54 return filepaths

OSError: Cannot find file /Users/jsdillon/Desktop/zen.2459842.35234.sum.uvh5
In [7]:
ants = sorted(set([ant for bl in hd.bls for ant in utils.split_bl(bl)]))
auto_bls = [bl for bl in data if (bl[0] == bl[1]) and (utils.split_pol(bl[2])[0] == utils.split_pol(bl[2])[1])]
antpols = sorted(set([ant[1] for ant in ants]))
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 ants = sorted(set([ant for bl in hd.bls for ant in utils.split_bl(bl)]))
      2 auto_bls = [bl for bl in data if (bl[0] == bl[1]) and (utils.split_pol(bl[2])[0] == utils.split_pol(bl[2])[1])]
      3 antpols = sorted(set([ant[1] for ant in ants]))

NameError: name 'hd' is not defined
In [8]:
# print basic information about the file
print(f'File: {SUM_FILE}')
print(f'JDs: {hd.times} ({np.median(np.diff(hd.times)) * 24 * 3600:.5f} s integrations)')
print(f'LSTS: {hd.lsts * 12 / np.pi } hours')
print(f'Frequencies: {len(hd.freqs)} {np.median(np.diff(hd.freqs)) / 1e6:.5f} MHz channels from {hd.freqs[0] / 1e6:.5f} to {hd.freqs[-1] / 1e6:.5f} MHz')
print(f'Antennas: {len(hd.data_ants)}')
print(f'Polarizations: {hd.pols}')
File: /Users/jsdillon/Desktop/zen.2459842.35234.sum.uvh5
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [8], in <cell line: 3>()
      1 # print basic information about the file
      2 print(f'File: {SUM_FILE}')
----> 3 print(f'JDs: {hd.times} ({np.median(np.diff(hd.times)) * 24 * 3600:.5f} s integrations)')
      4 print(f'LSTS: {hd.lsts * 12 / np.pi } hours')
      5 print(f'Frequencies: {len(hd.freqs)} {np.median(np.diff(hd.freqs)) / 1e6:.5f} MHz channels from {hd.freqs[0] / 1e6:.5f} to {hd.freqs[-1] / 1e6:.5f} MHz')

NameError: name 'hd' is not defined

Classify good, suspect, and bad antpols¶

Run ant_metrics¶

This classifies antennas as cross-polarized, low-correlation, or dead. Such antennas are excluded from any calibration.

In [9]:
am = ant_metrics.AntennaMetrics(SUM_FILE, DIFF_FILE, sum_data=data, diff_data=diff_data)
am.iterative_antenna_metrics_and_flagging(crossCut=am_xpol_bad[1], deadCut=am_corr_bad[1])
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [9], in <cell line: 1>()
----> 1 am = ant_metrics.AntennaMetrics(SUM_FILE, DIFF_FILE, sum_data=data, diff_data=diff_data)
      2 am.iterative_antenna_metrics_and_flagging(crossCut=am_xpol_bad[1], deadCut=am_corr_bad[1])

NameError: name 'data' is not defined
In [10]:
# Turn ant metrics into classifications
totally_dead_ants = [ant for ant, i in am.removal_iteration.items() if i == -1]
am_totally_dead = ant_class.AntennaClassification(good=[ant for ant in ants if ant not in totally_dead_ants], bad=totally_dead_ants)
am_corr = ant_class.antenna_bounds_checker(am.final_metrics['corr'], bad=[am_corr_bad], suspect=[am_corr_suspect], good=[(0, 1)])
am_xpol = ant_class.antenna_bounds_checker(am.final_metrics['corrXPol'], bad=[am_xpol_bad], suspect=[am_xpol_suspect], good=[(-1, 1)])
ant_metrics_class = am_totally_dead + am_corr + am_xpol
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [10], in <cell line: 2>()
      1 # Turn ant metrics into classifications
----> 2 totally_dead_ants = [ant for ant, i in am.removal_iteration.items() if i == -1]
      3 am_totally_dead = ant_class.AntennaClassification(good=[ant for ant in ants if ant not in totally_dead_ants], bad=totally_dead_ants)
      4 am_corr = ant_class.antenna_bounds_checker(am.final_metrics['corr'], bad=[am_corr_bad], suspect=[am_corr_suspect], good=[(0, 1)])

NameError: name 'am' is not defined

Classify antennas responsible for 0s in visibilities as bad:¶

This classifier looks for X-engine failure or packet loss specific to an antenna, either of which causes the even visibilities, the odd visibilities, or both to be 0s.

In [11]:
zeros_class = ant_class.even_odd_zeros_checker(data, diff_data, good=good_zeros_per_eo_spectrum, suspect=suspect_zeros_per_eo_spectrum)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [11], in <cell line: 1>()
----> 1 zeros_class = ant_class.even_odd_zeros_checker(data, diff_data, good=good_zeros_per_eo_spectrum, suspect=suspect_zeros_per_eo_spectrum)

NameError: name 'data' is not defined
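
To make the even/odd check concrete, here is a small sketch that counts zeros in the even and odd spectra of one baseline. It assumes the sum file holds even + odd and the diff file holds even - odd; that convention is an assumption, since this notebook does not state it. `count_even_odd_zeros` is a hypothetical helper, and the real checker may aggregate counts per antenna differently.

```python
def count_even_odd_zeros(sum_spectrum, diff_spectrum):
    """Count exact zeros in the even and odd spectra implied by sum and diff.

    Assumes sum = even + odd and diff = even - odd (an assumed convention).
    """
    even = [(s + d) / 2 for s, d in zip(sum_spectrum, diff_spectrum)]
    odd = [(s - d) / 2 for s, d in zip(sum_spectrum, diff_spectrum)]
    return sum(v == 0 for v in even), sum(v == 0 for v in odd)

# Toy spectrum: in channel 1 the odd half was lost, so sum equals diff there.
sum_spec = [2 + 2j, 1 + 1j, 4 + 0j]
diff_spec = [0.5j, 1 + 1j, -1 + 0j]
n_even, n_odd = count_even_odd_zeros(sum_spec, diff_spec)
print(n_even, n_odd)
```

Counts like these would then be compared against `good_zeros_per_eo_spectrum` and `suspect_zeros_per_eo_spectrum`.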

Examine and classify autocorrelation power, slope, and RFI occupancy¶

These classifiers look for antennas with too high or low power, too steep a slope, or too much excess RFI.

In [12]:
auto_power_class = ant_class.auto_power_checker(data, good=auto_power_good, suspect=auto_power_suspect)
auto_slope_class = ant_class.auto_slope_checker(data, good=auto_slope_good, suspect=auto_slope_suspect, edge_cut=100, filt_size=17)
auto_rfi_class = ant_class.auto_rfi_checker(data, good=auto_rfi_good, suspect=auto_rfi_suspect)
auto_class = auto_power_class + auto_slope_class + auto_rfi_class
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 auto_power_class = ant_class.auto_power_checker(data, good=auto_power_good, suspect=auto_power_suspect)
      2 auto_slope_class = ant_class.auto_slope_checker(data, good=auto_slope_good, suspect=auto_slope_suspect, edge_cut=100, filt_size=17)
      3 auto_rfi_class = ant_class.auto_rfi_checker(data, good=auto_rfi_good, suspect=auto_rfi_suspect)

NameError: name 'data' is not defined
In [13]:
def autocorr_plot():    
    fig, axes = plt.subplots(1, 2, figsize=(14, 5), dpi=100, sharey=True, gridspec_kw={'wspace': 0})
    labels = []
    colors = ['darkgreen', 'goldenrod', 'maroon']
    for ax, pol in zip(axes, antpols):
        for ant in auto_class.ants:
            if ant[1] == pol:
                color = colors[auto_class.quality_classes.index(auto_class[ant])]
                ax.semilogy(np.mean(data[utils.join_bl(ant, ant)], axis=0), color=color, lw=.5)
        ax.set_xlabel('Channel', fontsize=12)
        ax.set_title(f'{utils.join_pol(pol, pol)}-Polarized Autos')

    axes[0].set_ylabel('Raw Autocorrelation', fontsize=12)
    axes[1].legend([matplotlib.lines.Line2D([0], [0], color=color) for color in colors], 
                   [cls.capitalize() for cls in auto_class.quality_classes], ncol=1, fontsize=12, loc='upper right', framealpha=1)
    plt.tight_layout()

Figure 1: Plot of autocorrelations with classifications¶

This figure shows a plot of all autocorrelations in the array, split by polarization. Antennas are classified based on their autocorrelations into good, suspect, and bad, by examining power, slope, and RFI-occupancy.

In [14]:
if PLOT: autocorr_plot()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [14], in <cell line: 1>()
----> 1 if PLOT: autocorr_plot()

Input In [13], in autocorr_plot()
      3 labels = []
      4 colors = ['darkgreen', 'goldenrod', 'maroon']
----> 5 for ax, pol in zip(axes, antpols):
      6     for ant in auto_class.ants:
      7         if ant[1] == pol:

NameError: name 'antpols' is not defined

Summarize antenna classification prior to redundant-baseline calibration¶

In [15]:
final_class = ant_metrics_class + zeros_class + auto_class
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [15], in <cell line: 1>()
----> 1 final_class = ant_metrics_class + zeros_class + auto_class

NameError: name 'ant_metrics_class' is not defined
In [16]:
def array_class_plot():
    fig, axes = plt.subplots(1, 2, figsize=(14, 6), dpi=100, gridspec_kw={'width_ratios': [2, 1]})
    plot_antclass(hd.antpos, final_class, ax=axes[0], ants=[ant for ant in hd.data_ants if ant < 320], legend=False, title='HERA Core')
    plot_antclass(hd.antpos, final_class, ax=axes[1], ants=[ant for ant in hd.data_ants if ant >= 320], radius=50, title='Outriggers')

Figure 2: Summary of antenna classifications prior to calibration¶

This figure shows the location and classification of all antennas prior to calibration. Antennas are split along the diagonal, with ee-polarized antpols represented by the southeast half of each antenna and nn-polarized antpols represented by the northwest half. Outriggers are split from the core and shown at exaggerated size in the right-hand panel. This classification includes ant_metrics, a count of the zeros in the even or odd visibilities, and autocorrelation power, slope, and RFI occupancy. An antenna classified as bad in any classification will be considered bad. An antenna marked as suspect in any classification will be considered suspect unless it is also classified as bad elsewhere.
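
The combination rule described above (bad anywhere means bad, suspect anywhere means suspect unless bad elsewhere) amounts to keeping the worst class seen for each antenna. A minimal sketch with plain dictionaries follows. `combine_classifications` and the toy antenna keys are hypothetical and stand in for the `+` operator used on the classification objects in this notebook.

```python
SEVERITY = {'good': 0, 'suspect': 1, 'bad': 2}

def combine_classifications(*classifications):
    """Merge {antenna: class} dicts, keeping the worst class seen for each antenna."""
    combined = {}
    for classification in classifications:
        for ant, quality in classification.items():
            worst_so_far = combined.get(ant, 'good')
            combined[ant] = max(worst_so_far, quality, key=SEVERITY.get)
    return combined

# Toy classifications for three antpols.
metrics = {(0, 'Jee'): 'good', (1, 'Jee'): 'suspect', (2, 'Jee'): 'bad'}
autos = {(0, 'Jee'): 'suspect', (1, 'Jee'): 'good', (2, 'Jee'): 'suspect'}
print(combine_classifications(metrics, autos))
```

Here antennas 0 and 1 end up suspect and antenna 2 stays bad, regardless of the order in which the classifications are combined.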

In [17]:
if PLOT: array_class_plot()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 if PLOT: array_class_plot()

Input In [16], in array_class_plot()
      1 def array_class_plot():
      2     fig, axes = plt.subplots(1, 2, figsize=(14, 6), dpi=100, gridspec_kw={'width_ratios': [2, 1]})
----> 3     plot_antclass(hd.antpos, final_class, ax=axes[0], ants=[ant for ant in hd.data_ants if ant < 320], legend=False, title='HERA Core')
      4     plot_antclass(hd.antpos, final_class, ax=axes[1], ants=[ant for ant in hd.data_ants if ant >= 320], radius=50, title='Outriggers')

NameError: name 'hd' is not defined

Perform redundant-baseline calibration¶

In [18]:
def classify_off_grid(reds, all_ants):
    '''Returns AntennaClassification of all_ants where good ants are in reds while bad ants are not.'''
    ants_in_reds = set([ant for red in reds for bl in red for ant in utils.split_bl(bl)])
    on_grid = [ant for ant in all_ants if ant in ants_in_reds]
    off_grid = [ant for ant in all_ants if ant not in ants_in_reds]
    return ant_class.AntennaClassification(good=on_grid, bad=off_grid)

Perform iterative redcal¶

In [19]:
rc_settings = {'fc_conv_crit': 1e-6, 'fc_maxiter': 1, 'fc_min_vis_per_ant': 100, 'max_dims': OC_MAX_DIMS,
               'oc_conv_crit': 1e-10, 'gain': 0.4, 'oc_maxiter': OC_MAXITER, 'check_after': OC_MAXITER}

# figure out and filter reds and classify antennas based on whether or not they are on the main grid
reds = redcal.get_reds(hd.data_antpos, pols=['ee', 'nn'], pol_mode='2pol')
reds = redcal.filter_reds(reds, ex_ants=final_class.bad_ants, max_dims=OC_MAX_DIMS, min_dim_size=1)
if OC_SKIP_OUTRIGGERS:
    reds = redcal.filter_reds(reds, ex_ants=[ant for ant in ants if ant[0] >= 320])
redcal_class = classify_off_grid(reds, ants)

# perform first stage of redundant calibration, using chi^2 bounds relaxed by a factor of 2
cal = redcal.redundantly_calibrate(data, reds, **rc_settings)
max_dly = np.max(np.abs(list(cal['fc_meta']['dlys'].values())))
med_cspa = {ant: np.median(cal['chisq_per_ant'][ant]) for ant in cal['chisq_per_ant']}
cspa_class = ant_class.antenna_bounds_checker(med_cspa, good=np.array(oc_cspa_good)*2, suspect=np.array(oc_cspa_suspect)*2, bad=(0, np.inf))
redcal_class += cspa_class

# iteratively rerun redundant calibration
for i in range(OC_MAX_RERUN):
    # build RedDataContainer of old visibility solution
    prior_vis = datacontainer.RedDataContainer(cal['v_omnical'], reds)
    
    # refilter reds and update classification to reflect new off-grid ants, if any
    reds = redcal.filter_reds(reds, ex_ants=(final_class + redcal_class).bad_ants, max_dims=OC_MAX_DIMS, min_dim_size=1)
    reds = sorted(reds, key=len, reverse=True)
    redcal_class += classify_off_grid(reds, ants)
    ants_in_reds = set([ant for red in reds for bl in red for ant in utils.split_bl(bl)])    
   
    # re-run redundant calibration using previous solution, updating bad and suspicious antennas
    prior_sol = redcal.RedSol(reds, gains={ant: cal['g_omnical'][ant] for ant in ants_in_reds}, 
                              vis={red[0]: prior_vis[red[0]] for red in reds})    
    cal = redcal.redundantly_calibrate(data, reds, prior_firstcal=prior_sol, prior_sol=prior_sol, **rc_settings)
    med_cspa = {ant: np.median(cal['chisq_per_ant'][ant]) for ant in cal['chisq_per_ant']}
    cspa_class = ant_class.antenna_bounds_checker(med_cspa, good=oc_cspa_good, suspect=oc_cspa_suspect, bad=(0, np.inf))    
    redcal_class += cspa_class
    if len(cspa_class.bad_ants) == 0:
        break  # no new antennas to flag
final_class += redcal_class
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [19], in <cell line: 5>()
      1 rc_settings = {'fc_conv_crit': 1e-6, 'fc_maxiter': 1, 'fc_min_vis_per_ant': 100, 'max_dims': OC_MAX_DIMS,
      2                'oc_conv_crit': 1e-10, 'gain': 0.4, 'oc_maxiter': OC_MAXITER, 'check_after': OC_MAXITER}
      4 # figure out and filter reds and classify antennas based on whether or not they are on the main grid
----> 5 reds = redcal.get_reds(hd.data_antpos, pols=['ee', 'nn'], pol_mode='2pol')
      6 reds = redcal.filter_reds(reds, ex_ants=final_class.bad_ants, max_dims=OC_MAX_DIMS, min_dim_size=1)
      7 if OC_SKIP_OUTRIGGERS:

NameError: name 'hd' is not defined

Fix the firstcal delay slope degeneracy using RFI transmitters¶

In [20]:
# find channels clearly contaminated by RFI
not_bad_ants = [ant for ant in final_class.ants if final_class[ant] != 'bad']
chan_flags = np.mean([xrfi.detrend_medfilt(data[utils.join_bl(ant, ant)], Kf=8, Kt=2) for ant in not_bad_ants], axis=(0, 1)) > 5

# hardcoded RFI transmitters and their headings
# channel: frequency (Hz), heading (rad), chi^2
phs_sol = {359: ( 90744018.5546875, 0.7853981, 23.3),
           360: ( 90866088.8671875, 0.7853981, 10.8),
           385: ( 93917846.6796875, 0.7853981, 27.3),
           386: ( 94039916.9921875, 0.7853981, 18.1),
           400: ( 95748901.3671875, 6.0632738, 24.0),
           441: (100753784.1796875, 0.7853981, 21.7),
           442: (100875854.4921875, 0.7853981, 19.4),
           455: (102462768.5546875, 6.0632738, 18.8),
           456: (102584838.8671875, 6.0632738,  8.8),
           471: (104415893.5546875, 0.7853981, 13.3),
           484: (106002807.6171875, 6.0632738, 21.2),
           485: (106124877.9296875, 6.0632738,  4.0),
          1181: (191085815.4296875, 0.7853981, 26.3),
          1182: (191207885.7421875, 0.7853981, 27.0),
          1183: (191329956.0546875, 0.7853981, 25.6),
          1448: (223678588.8671875, 2.6075219, 25.7),
          1449: (223800659.1796875, 2.6075219, 22.6),
          1450: (223922729.4921875, 2.6075219, 11.6),
          1451: (224044799.8046875, 2.6075219,  5.9),
          1452: (224166870.1171875, 2.6075219, 22.6),
          1510: (231246948.2421875, 0.1068141, 23.9)}
rfi_chans = [chan for chan in phs_sol if chan_flags[chan]]
print('Channels used for delay-slope calibration with RFI:', rfi_chans)
rfi_angles = np.array([phs_sol[chan][1] for chan in rfi_chans])
rfi_headings = np.array([np.cos(rfi_angles), np.sin(rfi_angles), np.zeros_like(rfi_angles)])
rfi_chisqs = np.array([phs_sol[chan][2] for chan in rfi_chans])
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [20], in <cell line: 2>()
      1 # find channels clearly contaminated by RFI
----> 2 not_bad_ants = [ant for ant in final_class.ants if final_class[ant] != 'bad']
      3 chan_flags = np.mean([xrfi.detrend_medfilt(data[utils.join_bl(ant, ant)], Kf=8, Kt=2) for ant in not_bad_ants], axis=(0, 1)) > 5
      5 # hardcoded RFI transmitters and their headings
      6 # channel: frequency (Hz), heading (rad), chi^2

NameError: name 'final_class' is not defined
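
A transmitter at a known heading imprints a predictable phase on each baseline, which is what lets it pin down a delay slope across the array. The sketch below turns a heading into a unit vector the same way the cell above does (`cos` for the first component, `sin` for the second) and computes the geometric delay for a baseline. It assumes positions are in East-North-Up meters and that the heading is measured from east toward north. The sign convention and the 14.6 m baseline are illustrative assumptions, and none of these helpers are part of hera_cal.

```python
import math

C = 299792458.0  # speed of light in m/s

def heading_unit_vector(heading_rad):
    """Unit vector for a heading, assumed to be measured from east toward north."""
    return (math.cos(heading_rad), math.sin(heading_rad), 0.0)

def geometric_delay(baseline_enu, heading_rad):
    """Delay in seconds of a plane wave along the heading across a baseline vector."""
    s_hat = heading_unit_vector(heading_rad)
    return sum(b * s for b, s in zip(baseline_enu, s_hat)) / C

def expected_phase(baseline_enu, heading_rad, freq_hz):
    """Phase in radians at freq_hz, wrapped to [-pi, pi]; the sign is a convention."""
    phase = 2 * math.pi * freq_hz * geometric_delay(baseline_enu, heading_rad)
    return math.atan2(math.sin(phase), math.cos(phase))

baseline = (14.6, 0.0, 0.0)  # an east-west baseline, length chosen for illustration
print(geometric_delay(baseline, 0.7853981))
print(expected_phase(baseline, 0.7853981, 90744018.5546875))
```

The heading and frequency in the usage lines are taken from the first entry of `phs_sol`.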
In [21]:
# resolve firstcal degeneracy with delay slopes set by RFI transmitters, update cal
RFI_dly_slope_gains = abscal.RFI_delay_slope_cal(reds, hd.antpos, cal['v_omnical'], hd.freqs, rfi_chans, rfi_headings, rfi_wgts=rfi_chisqs**-.5,
                                                 min_tau=-max_dly, max_tau=max_dly, delta_tau=0.1e-9, return_gains=True, gain_ants=cal['g_omnical'].keys())
cal['g_omnical'] = {ant: g * RFI_dly_slope_gains[ant] for ant, g in cal['g_omnical'].items()}
apply_cal.calibrate_in_place(cal['v_omnical'], RFI_dly_slope_gains)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [21], in <cell line: 2>()
      1 # resolve firstcal degeneracy with delay slopes set by RFI transmitters, update cal
----> 2 RFI_dly_slope_gains = abscal.RFI_delay_slope_cal(reds, hd.antpos, cal['v_omnical'], hd.freqs, rfi_chans, rfi_headings, rfi_wgts=rfi_chisqs**-.5,
      3                                                  min_tau=-max_dly, max_tau=max_dly, delta_tau=0.1e-9, return_gains=True, gain_ants=cal['g_omnical'].keys())
      4 cal['g_omnical'] = {ant: g * RFI_dly_slope_gains[ant] for ant, g in cal['g_omnical'].items()}
      5 apply_cal.calibrate_in_place(cal['v_omnical'], RFI_dly_slope_gains)

NameError: name 'reds' is not defined

TODO: Perform approximate absolute amplitude calibration using a model of autocorrelations¶

In [22]:
# TODO
In [23]:
def redundant_group_plot():
    fig, axes = plt.subplots(2, 2, figsize=(14, 6), dpi=100, sharex='col', sharey='row', gridspec_kw={'hspace': 0, 'wspace': 0})
    for i, pol in enumerate(['ee', 'nn']):
        reds_here = redcal.get_reds(hd.data_antpos, pols=[pol], pol_mode='1pol')
        red = sorted(redcal.filter_reds(reds_here, ex_ants=final_class.bad_ants), key=len, reverse=True)[0]
        rc_data = {bl: np.array(data[bl]) for bl in red}
        apply_cal.calibrate_in_place(rc_data, cal['g_omnical'])
        for bl in red:
            axes[0, i].plot(hd.freqs/1e6, np.angle(np.mean(rc_data[bl], axis=0)), alpha=.5, lw=.5)
            axes[1, i].semilogy(hd.freqs/1e6, np.abs(np.mean(rc_data[bl], axis=0)), alpha=.5, lw=.5)
        axes[0, i].plot(hd.freqs / 1e6, np.angle(np.mean(cal['v_omnical'][red[0]], axis=0)), lw=2, c='k')
        axes[1, i].semilogy(hd.freqs / 1e6, np.abs(np.mean(cal['v_omnical'][red[0]], axis=0)), lw=2, c='k', label=f'Baseline Group:\n{red[0]}')

        axes[1, i].set_xlabel('Frequency (MHz)')
        axes[1, i].legend(loc='upper right')
    axes[0, 0].set_ylabel('Visibility Phase')
    axes[1, 0].set_ylabel('Visibility Amplitude')
    plt.tight_layout()

Figure 3: Redundant calibration of a single baseline group¶

This figure shows the results of redundant-baseline calibration for a single group: the one with the highest redundancy in each polarization after antenna classification and excision based on the above, plus the removal of antennas with high $\chi^2$ per antenna. The black line is the redundant visibility solution. Each thin colored line is a different baseline within that group. Phases are shown in the top row, amplitudes in the bottom, ee-polarized visibilities in the left column, and nn-polarized visibilities in the right.

In [24]:
if PLOT: redundant_group_plot()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [24], in <cell line: 1>()
----> 1 if PLOT: redundant_group_plot()

Input In [23], in redundant_group_plot()
      2 fig, axes = plt.subplots(2, 2, figsize=(14, 6), dpi=100, sharex='col', sharey='row', gridspec_kw={'hspace': 0, 'wspace': 0})
      3 for i, pol in enumerate(['ee', 'nn']):
----> 4     reds_here = redcal.get_reds(hd.data_antpos, pols=[pol], pol_mode='1pol')
      5     red = sorted(redcal.filter_reds(reds_here, ex_ants=final_class.bad_ants), key=len, reverse=True)[0]
      6     rc_data = {bl: np.array(data[bl]) for bl in red}

NameError: name 'hd' is not defined

Attempt to calibrate some flagged antennas¶

This attempts to calibrate bad antennas using information from good or suspect antennas without allowing bad antennas to affect their calibration. However, antennas flagged for ant_metrics or lots of zeros in the even or odd visibilities are considered beyond saving. Likewise, antennas that would add extra degeneracies (controlled by OC_MAX_DIMS, OC_MIN_DIM_SIZE, and OC_SKIP_OUTRIGGERS) are excluded.

In [25]:
expanded_reds = redcal.get_reds(hd.data_antpos, pols=['ee', 'nn'], pol_mode='2pol')
expanded_reds = redcal.filter_reds(expanded_reds, ex_ants=(ant_metrics_class + zeros_class).bad_ants, max_dims=OC_MAX_DIMS, min_dim_size=OC_MIN_DIM_SIZE)
if OC_SKIP_OUTRIGGERS:
    expanded_reds = redcal.filter_reds(expanded_reds, ex_ants=[ant for ant in ants if ant[0] >= 320])
nsamples = datacontainer.DataContainer({bl: np.ones_like(data[bl], dtype=float) for bl in data})
redcal.expand_omni_sol(cal, expanded_reds, data, nsamples)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [25], in <cell line: 1>()
----> 1 expanded_reds = redcal.get_reds(hd.data_antpos, pols=['ee', 'nn'], pol_mode='2pol')
      2 expanded_reds = redcal.filter_reds(expanded_reds, ex_ants=(ant_metrics_class + zeros_class).bad_ants, max_dims=OC_MAX_DIMS, min_dim_size=OC_MIN_DIM_SIZE)
      3 if OC_SKIP_OUTRIGGERS:

NameError: name 'hd' is not defined
In [26]:
def array_chisq_plot():
    fig, axes = plt.subplots(1, 2, figsize=(14, 5), dpi=100)
    for ax, pol in zip(axes, ['ee', 'nn']):
        ants_to_plot = set([ant for ant in cal['chisq_per_ant'] if utils.join_pol(ant[1], ant[1]) == pol])
        cspas = [np.median(cal['chisq_per_ant'][ant]) for ant in ants_to_plot]
        xpos = [hd.antpos[ant[0]][0] for ant in ants_to_plot]
        ypos = [hd.antpos[ant[0]][1] for ant in ants_to_plot]
        scatter = ax.scatter(xpos, ypos, s=300, c=cspas, norm=matplotlib.colors.LogNorm(vmin=1, vmax=10))
        for ant in ants_to_plot:
            ax.text(hd.antpos[ant[0]][0], hd.antpos[ant[0]][1], ant[0], va='center', ha='center', fontsize=9,
                    c=('r' if ant in final_class.bad_ants else 'w'))
        plt.colorbar(scatter, ax=ax, extend='both')
        ax.axis('equal')
        ax.set_xlabel('East-West Position (meters)')
        ax.set_ylabel('North-South Position (meters)')
        ax.set_title(f'{pol}-pol $\\chi^2$ / Antenna (Red is Flagged)')
    plt.tight_layout()

Figure 4: $\chi^2$ per antenna across the array¶

This plot shows the median (taken over time and frequency) of the normalized $\chi^2$ per antenna. The expectation value for this quantity when the array is perfectly redundant is 1.0. Antennas that are classified as bad for any reason have their numbers shown in red. Some of those antennas were classified as bad during redundant calibration for high $\chi^2$; others were excluded from redundant calibration from the start because they had already been classified as bad. See here for more details. Note that the color scale saturates below 1 and above 10.
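
As a rough illustration of what a per-antenna $\chi^2$ measures, the sketch below compares each observed visibility in one redundant group with the model $g_i g_j^* V_{\rm sol}$ and averages the noise-weighted squared residuals over the baselines each antenna takes part in. Dividing by the baseline count is a crude stand-in for the proper degrees-of-freedom normalization, and the gains, visibility, and noise variance are made up, so this is not the hera_cal estimator. `chisq_per_antenna` is a hypothetical function.

```python
def chisq_per_antenna(data, gains, vis_sol, noise_var):
    """Mean of |V_ij - g_i conj(g_j) V_sol|^2 / noise_var over each antenna's baselines.

    Normalizing by baseline count ignores degrees-of-freedom corrections,
    so this is an illustration rather than a calibrated statistic.
    """
    totals, counts = {}, {}
    for (i, j), v_obs in data.items():
        resid = abs(v_obs - gains[i] * gains[j].conjugate() * vis_sol) ** 2 / noise_var
        for ant in (i, j):
            totals[ant] = totals.get(ant, 0.0) + resid
            counts[ant] = counts.get(ant, 0) + 1
    return {ant: totals[ant] / counts[ant] for ant in totals}

# One redundant group: every baseline should see the same true visibility.
true_vis = 3 + 4j
gains = {0: 1 + 0j, 1: 0.9 + 0.1j, 2: 1.1 - 0.2j, 3: 1 + 0j}
data = {(i, i + 1): gains[i] * gains[i + 1].conjugate() * true_vis for i in range(3)}
data[(2, 3)] *= 1.5  # baseline (2, 3) deviates from the redundant model
print(chisq_per_antenna(data, gains, true_vis, noise_var=1.0))
```

In this toy case the antennas touching the non-redundant baseline pick up a nonzero value while the others stay at zero, which is the kind of contrast the color scale in the figure is meant to reveal.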

In [27]:
if PLOT: array_chisq_plot()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [27], in <cell line: 1>()
----> 1 if PLOT: array_chisq_plot()

Input In [26], in array_chisq_plot()
      2 fig, axes = plt.subplots(1, 2, figsize=(14, 5), dpi=100)
      3 for ax, pol in zip(axes, ['ee', 'nn']):
----> 4     ants_to_plot = set([ant for ant in cal['chisq_per_ant'] if utils.join_pol(ant[1], ant[1]) == pol])
      5     cspas = [np.median(cal['chisq_per_ant'][ant]) for ant in ants_to_plot]
      6     xpos = [hd.antpos[ant[0]][0] for ant in ants_to_plot]

NameError: name 'cal' is not defined
In [28]:
def array_class_after_redcal_plot():
    fig, axes = plt.subplots(1, 2, figsize=(14, 6), dpi=100, gridspec_kw={'width_ratios': [2, 1]})
    plot_antclass(hd.antpos, final_class, ax=axes[0], ants=[ant for ant in hd.data_ants if ant < 320], legend=False, title='HERA Core, Post-Redcal')
    plot_antclass(hd.antpos, final_class, ax=axes[1], ants=[ant for ant in hd.data_ants if ant >= 320], radius=50, title='Outriggers')

Figure 5: Summary of antenna classifications after redundant calibration¶

This figure is the same as Figure 2, except that it now includes additional suspect or bad antennas based on redundant calibration. This can include antennas with high $\chi^2$, but it can also include antennas classified as "bad" because they would add extra degeneracies to calibration.

In [29]:
if PLOT: array_class_after_redcal_plot()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [29], in <cell line: 1>()
----> 1 if PLOT: array_class_after_redcal_plot()

Input In [28], in array_class_after_redcal_plot()
      1 def array_class_after_redcal_plot():
      2     fig, axes = plt.subplots(1, 2, figsize=(14, 6), dpi=100, gridspec_kw={'width_ratios': [2, 1]})
----> 3     plot_antclass(hd.antpos, final_class, ax=axes[0], ants=[ant for ant in hd.data_ants if ant < 320], legend=False, title='HERA Core, Post-Redcal')
      4     plot_antclass(hd.antpos, final_class, ax=axes[1], ants=[ant for ant in hd.data_ants if ant >= 320], radius=50, title='Outriggers')

NameError: name 'hd' is not defined
In [30]:
to_show = {'Antenna': [f'{ant[0]}{ant[1][-1]}' for ant in ants]}
classes = {'Antenna': [final_class[ant] if ant in final_class else '-' for ant in ants]}
to_show['Dead?'] = [{'good': 'No', 'bad': 'Yes'}[am_totally_dead[ant]] if (ant in am_totally_dead) else '' for ant in ants]
classes['Dead?'] = [am_totally_dead[ant] if (ant in am_totally_dead) else '' for ant in ants]
for title, ac in [('Low Correlation', am_corr),
                  ('Cross-Polarized', am_xpol),
                  ('Even/Odd Zeros', zeros_class),
                  ('Autocorr Power', auto_power_class),
                  ('Autocorr Slope', auto_slope_class),
                  ('RFI in Autos', auto_rfi_class)]:
    to_show[title] = [f'{ac._data[ant]:.2G}' if (ant in ac._data) else '' for ant in ants]
    classes[title] = [ac[ant] if ant in ac else '' for ant in ants]
    
to_show['Redcal chi^2'] = [f'{np.median(cal["chisq_per_ant"][ant]):.3G}' if (ant in cal['chisq_per_ant']) else '-' for ant in ants]
classes['Redcal chi^2'] = [redcal_class[ant] if ant in redcal_class else '' for ant in ants]

df = pd.DataFrame(to_show)
df2 = pd.DataFrame(classes)
colors = {'good': 'darkgreen', 'suspect': 'goldenrod', 'bad': 'maroon'}
# Map each class to a background color; leave unclassified cells unstyled
df2 = df2.applymap(lambda x: f'background-color: {colors[x]}' if x in colors else '')

table = df.style.hide_index() \
                .apply(lambda x: pd.DataFrame(df2.values, columns=x.columns), axis=None) \
                .set_properties(subset=['Antenna'], **{'font-weight': 'bold', 'border-right': "3pt solid black"}) \
                .set_properties(subset=df.columns[1:], **{'border-left': "1pt solid black"}) \
                .set_properties(**{'text-align': 'center', 'color': 'white'}) \
                .set_sticky(axis=1)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [30], in <cell line: 1>()
----> 1 to_show = {'Antenna': [f'{ant[0]}{ant[1][-1]}' for ant in ants]}
      2 classes = {'Antenna': [final_class[ant] if ant in final_class else '-' for ant in ants]}
      3 to_show['Dead?'] = [{'good': 'No', 'bad': 'Yes'}[am_totally_dead[ant]] if (ant in am_totally_dead) else '' for ant in ants]

NameError: name 'ants' is not defined

Table 1: Complete summary of per-antenna classifications¶

This table summarizes the results of the various classification schemes detailed above. As before, green is good, yellow is suspect, and red is bad. The color for each antenna (first column) is the final summary of all other classifications. Antennas missing from redcal $\chi^2$ were excluded from redundant-baseline calibration, either because they were flagged by ant_metrics or the even/odd zeros check, or because they would add unwanted extra degeneracies.
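
One way to picture the "final summary" in the first column is as the most severe class an antenna receives across all checks. The sketch below assumes that worst-case rule. The helper `worst_class`, the `SEVERITY` ordering, and the example inputs are hypothetical and are not taken from the notebook or from hera_qm.

```python
# Assumed severity ordering: a higher number is a worse classification
SEVERITY = {'good': 0, 'suspect': 1, 'bad': 2}


def worst_class(*per_check_classes):
    """Combine per-check dicts of antenna -> class, keeping the most severe class per antenna."""
    combined = {}
    for classes in per_check_classes:
        for ant, cls in classes.items():
            if ant not in combined or SEVERITY[cls] > SEVERITY[combined[ant]]:
                combined[ant] = cls
    return combined


# Made-up per-check results keyed by (antenna number, polarization)
auto_power = {(0, 'Jee'): 'good', (1, 'Jee'): 'suspect', (2, 'Jee'): 'good'}
redcal_chisq = {(0, 'Jee'): 'good', (1, 'Jee'): 'good', (2, 'Jee'): 'bad'}

final_class_sketch = worst_class(auto_power, redcal_chisq)
print(final_class_sketch)
```

Under this rule a single bad result in any column is enough to make the antenna's summary cell red, which matches how the table is meant to be read.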

In [31]:
HTML(table.render())
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [31], in <cell line: 1>()
----> 1 HTML(table.render())

NameError: name 'table' is not defined

TODO: Find and flag RFI¶

TODO: Perform nucal¶

Metadata¶

In [32]:
from hera_cal import __version__
print('hera_cal:', __version__)
from hera_qm import __version__
print('hera_qm:', __version__)
hera_cal: 3.1.5.dev78+gda49a16
hera_qm: 2.0.4.dev9+g5da028f
In [33]:
print(f'Finished execution in {(time.time() - tstart) / 60:.2f} minutes.')
Finished execution in 0.21 minutes.