# Quality fundamentals for stock pool creation
Fundamental data of listed companies is important evidence of a company's historical operating performance and an important basis on which investors judge its future prospects. Financial analysts and stock investors analyse the quality of a company's fundamentals to assess the investment value of its stock. The fundamental data available from the Tushare platform consists mainly of regularly published reports on a company's operating results, profitability, operating capacity, growth capacity, solvency and cash flow, each of which reflects a different aspect of the company's operations.
With nearly 3,600 normally traded stocks on China's stock market (the Shanghai and Shenzhen Stock Exchanges), the average investor does not have the time or energy to analyse the fundamental data of every listed company when looking for an investment target. Investors therefore need to automate the screening using a few key indicators chosen from experience, i.e. rely on a computer program to pick out stocks with high-quality fundamentals from the full list of listed companies, which greatly improves the efficiency of finding the stocks of quality companies. For example, a company's profitability, growth and cash flow are among the indicators used to identify quality stocks as investment candidates.
Screening for quality stocks involves the following four basic steps.
① Use the fundamental data interface function built into the Tushare package to obtain the fundamental data of all stocks.
② Determine the key indicator items reflecting the quality of fundamentals based on experience and extract the data series corresponding to the key indicator items from the fundamentals data of all stocks.
③ Use the Pandas built-in merge function to combine the fundamental data series into a single DataFrame.
④ Determine the sort keys according to the importance of the key indicator items and sort the merged data. The top rows of the sorted result are then selected, and the stocks in these rows form the set of stocks with relatively high fundamental quality (i.e. the quality stock pool).
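The four steps can be tried on a small made-up dataset before turning to real data. In the sketch below, the three frames are hypothetical stand-ins for the tables returned by the Tushare interface functions. The column names follow those used later in this section, and all values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the three fundamental tables (values invented).
profit = pd.DataFrame({
    "code": ["000001", "000002", "000003", "000004"],
    "name": ["A", "B", "C", "D"],
    "roe": [12.0, 8.5, 15.2, 3.1],
})
growth = pd.DataFrame({
    "code": ["000001", "000002", "000003"],  # 000004 has no growth record
    "nprg": [20.0, 35.0, 20.0],
})
cashflow = pd.DataFrame({
    "code": ["000001", "000002", "000003", "000004"],
    "cf_nm": [1.1, 0.7, 0.9, 1.5],
})

# Step 3: left-join on the stock code; rows left with NaN are dropped.
merged = (
    profit.merge(growth, on="code", how="left")
          .merge(cashflow, on="code", how="left")
          .dropna()
)

# Step 4: sort by the indicators in order of importance and keep the top rows.
pool = merged.sort_values(["nprg", "roe", "cf_nm"], ascending=False).head(2)
print(pool["code"].tolist())  # ['000002', '000003']
```

Stock 000004 drops out because it has no match in the growth table, and the tie on `nprg` between 000001 and 000003 is broken by the second key, `roe`.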
The table below shows the interface functions for obtaining company profitability, growth and cash flow fundamentals and the information on the parameters returned.
As the data items listed in the table above show, the interface functions return a rich variety of data, and many combinations of data items can be used to reflect a company's investment value. For example, from the three types of data in the table, the net profit margin, return on net assets, earnings per share, net profit growth rate, net asset growth rate, the ratio of net operating cash flow to net profit, and the cash flow ratio can be selected as key evaluation indicators, and the values of these seven indicators used together to reflect the quality of the company's fundamentals. Generally, the higher the values of these indicators, the higher the company's fundamental quality and the greater the investment value of its shares. The following program demonstrates one way of using financial indicator data to screen for the stocks of quality companies.
import tushare as ts
import pandas as pd
import datetime
# Get the latest financial statement data. A-share listed companies in China must disclose
# the first quarterly report by April 30, the half-yearly report by August 31, the third
# quarterly report by October 30, and the annual report by April 30 of the following year.
this_year = datetime.datetime.today().year
this_month = datetime.datetime.today().month
if this_month >= 11:  # This year's third quarterly report has been published
    fin_year = this_year
    fin_sea = 3
elif this_month >= 5:  # The previous year's annual report is out by the end of April; this year's first quarterly report could be used instead
    fin_year = this_year - 1
    fin_sea = 4
else:  # Before May, fall back to the previous year's third quarterly report
    fin_year = this_year - 1
    fin_sea = 3
print("%s year %s quarter" % (fin_year, fin_sea))
# Example printout: 2019 year 4 quarter
df1 = ts.get_profit_data(fin_year, fin_sea)
df2 = ts.get_growth_data(fin_year, fin_sea)
df3 = ts.get_cashflow_data(fin_year, fin_sea)
If you saved the financial data to a file on an earlier run, the program can read it directly from the local file, which avoids downloading the same data every time the program runs.
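One way to arrange this is a small load-or-download helper, sketched below. The helper name `load_or_fetch` and the file name in the usage note are hypothetical. The `fetch` argument is any zero-argument callable that returns a DataFrame, so the helper itself does not depend on Tushare.

```python
import os
import pandas as pd

def load_or_fetch(path, fetch):
    """Return the DataFrame cached at `path`; call `fetch()` only if no file exists."""
    if os.path.exists(path):
        # Read stock codes as strings so leading zeros (e.g. 000001) survive.
        return pd.read_csv(path, dtype={"code": str})
    df = fetch()
    df.to_csv(path, index=False)  # cache the download for the next run
    return df
```

With the variables of the program above, a call might look like `df1 = load_or_fetch('profit20194.csv', lambda: ts.get_profit_data(fin_year, fin_sea))`. Deleting the file forces a fresh download.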
# code: stock code; name: stock name; net_profit_ratio: net profit margin (%);
# roe: return on net assets (%); eps: earnings per share;
# nprg: net profit growth rate (%); nav: net asset growth rate.
df_merge = pd.merge(df1[['code', 'name', 'net_profit_ratio', 'roe', 'eps']],
                    df2[['code', 'nprg', 'nav']], on='code', how='left')
# Left outer join: every row of the left table is kept and matched against the right table;
# columns of rows with no match in the right table are filled with NaN.
# cf_nm: ratio of net operating cash flow to net profit; cashflowratio: cash flow ratio.
df_merge = pd.merge(df_merge, df3[['code', 'cf_nm', 'cashflowratio']],
                    on='code', how='left').dropna()  # Delete rows containing NaN
focus_df = df_merge.sort_values(['nprg', 'net_profit_ratio', 'cf_nm', 'nav',
                                 'roe', 'eps', 'cashflowratio'],
                                ascending=False)  # nprg is the primary sort key
Interested readers can rearrange the order of the key columns used to sort the merged table, compare the set of stocks and their ranking in the final table select_df, examine how the results differ for different indicator orders, and look up the corresponding stocks to see how their prices have changed over the last 3 years.
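The effect of the key order can be seen on a toy table. The sketch below uses invented values for three stocks and a hypothetical helper, `top_codes`. It only illustrates that the first key dominates a multi-column sort and is not a recommendation for any particular order.

```python
import pandas as pd

# Invented values for three stocks; column names follow the program above.
df = pd.DataFrame({
    "code": ["000001", "000002", "000003"],
    "nprg": [50.0, 10.0, 30.0],
    "roe": [5.0, 25.0, 15.0],
})

def top_codes(frame, keys, n=2):
    """Sort descending by `keys` (the first key is primary) and return the top n codes."""
    return frame.sort_values(keys, ascending=False)["code"].head(n).tolist()

by_growth = top_codes(df, ["nprg", "roe"])
by_roe = top_codes(df, ["roe", "nprg"])
print(by_growth)  # ['000001', '000003']
print(by_roe)     # ['000002', '000003']
```

Later keys only break ties among rows with equal values in the earlier keys, so swapping the first key can change which stocks enter the pool, not just their order.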
# Prefix the code with a tab ('\t') so that it is written to the CSV file as text
focus_df['code'] = '\t' + focus_df['code']
# Keep at most the first 100 rows; head() returns all rows when there are fewer than 100
select_df = focus_df[['code', 'name', 'nprg', 'net_profit_ratio', 'cf_nm', 'nav',
                      'roe', 'eps', 'cashflowratio']].head(100)
select_df.to_csv('focus' + str(fin_year) + str(fin_sea) + '.csv', encoding='cp936', index=False)
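The tab prefix is presumably there because six-digit stock codes such as 000001 lose their leading zeros once a tool interprets them as numbers. When the file is read back with Pandas rather than opened in a spreadsheet, another option is to declare the column type on reading. The sketch below uses an in-memory CSV with invented rows and no tab prefix to show the difference.

```python
import io
import pandas as pd

csv_text = "code,name\n000001,A\n600000,B\n"

# Default parsing infers an integer column, so 000001 becomes 1.
as_numbers = pd.read_csv(io.StringIO(csv_text))
# Declaring the column as str keeps the six-digit code intact.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"code": str})

print(as_numbers["code"].tolist())  # [1, 600000]
print(as_text["code"].tolist())     # ['000001', '600000']
```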
The disclosure deadlines for the financial reports of A-share listed companies in China are as follows: the first quarterly report is disclosed by 30 April, the half-yearly report by 31 August, the third quarterly report by 30 October, and the annual report by 30 April of the following year. The program uses the date functions of the datetime module to determine which financial report data is available.
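The same rule can be wrapped in a function so that it can be checked against fixed dates instead of the current date. The function name `latest_report_period` is hypothetical. Like the program, this sketch only chooses between the third quarterly report and the annual report and ignores the first quarterly and half-yearly reports.

```python
import datetime

def latest_report_period(today):
    """Return (year, quarter) of the latest report assumed to be fully published."""
    if today.month >= 11:       # this year's third quarterly report is out
        return today.year, 3
    if today.month >= 5:        # last year's annual report is out
        return today.year - 1, 4
    return today.year - 1, 3    # January to April: last year's third quarterly report

print(latest_report_period(datetime.date(2020, 6, 15)))  # (2019, 4)
```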
In practice, there is no uniform standard for choosing the key indicators that identify stocks with quality fundamentals, partly because different financial indicators reflect different aspects of a company, and partly because it is unclear how far the financial indicators of companies in different industries can be compared numerically. This example only shows a way of using a Python program to make screening for quality companies more efficient; it does not provide an investment basis for selecting quality stocks.